Modified sharp regression discontinuity model to settings with fuzzy variables

Objective The goal of this study is to develop a Modified Sharp Regression Discontinuity model to predict alcohol consumption in People Living with Human Immunodeficiency Virus (HIV) and Acquired Immunodeficiency Syndrome (AIDS). Previous studies focused on either fuzzy dependent or fuzzy independent variables separately. There is therefore a gap in research examining both types of fuzzy variables together, so the proposed model considers both dependent and independent fuzzy variables. Methods A statistical model was developed to predict the relationship between alcohol consumption and HIV progression. The model equations are solved numerically using parametric estimation. Results In simulation studies, as the sample size increased, the estimates derived from the modified sharp regression discontinuity model converged in probability towards the true value, supporting the consistency of the estimator of the Average Causal Effect. In the sharp Regression Discontinuity Design (RDD), the average causal effect of counseling for compliers is approximately 0.199. This is the variation in Alcohol Use Disorders Identification Test (AUDIT) scores at the threshold, or the change in intercept, when counseling was effective. Following six months of participation in the counseling program, AUDIT scores decreased, leading to an increase in Cluster of Differentiation 4 (CD4) counts and a decrease in viral loads. Conclusion The Modified Sharp RDD offers a robust approach to handling fuzzy variables in causal inference. Our study contributes to the advancement of RDD methodology and its applicability in real-world settings with uncertain data.


Introduction
In this research study, a total of 234 individuals were initially recruited. However, during the follow-up at 3 and 6 months, 44 participants were lost, resulting in a final sample size of 190 participants. The data for this study were collected from 16 HIV care clinics in Zimbabwe, which were selected using cluster sampling. These public health facilities were chosen on a national scale using a combination of stratified randomization and random allocation methods. Participants who scored 15 or above on the AUDIT (Alcohol Use Disorders Identification Test) were identified as individuals with alcohol dependence and received counseling sessions.
The prediction of alcohol consumption in People Living with Human Immunodeficiency Virus (HIV) and Acquired Immunodeficiency Syndrome (AIDS) is a significant concern for public health. Sharp Regression Discontinuity Design (RDD) has proven useful for analyzing causal effects in such settings. However, traditional RDD models assume crisp data, while real-world data often involve uncertainty or fuzziness. To address this limitation, a Modified Sharp Regression Discontinuity model that accounts for both dependent and independent fuzzy variables is proposed.
The regression discontinuity design (RDD) is said to be sharp if the probability of being treated jumps from zero to one at the cutoff; otherwise, it is said to be fuzzy. The RDD is sharp when all individuals receive the treatment they were assigned [1].
According to [2], in a sharp regression discontinuity, the probability of receiving treatment jumps deterministically from 0 to 1 at the cutoff. Everyone on one side of the cutoff gets treatment, but no one on the other side gets it. In the sharp regression discontinuity, the treatment effect is assessed by comparing outcomes for those immediately above and below the cutoff.
In the context of HIV and AIDS patient data, fuzzy variables play a crucial role, particularly when dealing with viral load and immune system strength. The fuzzy nature of these variables makes the traditional RDD less effective in capturing the complex relationships within the data [3]. To address this limitation, our study proposes a Modified Sharp RDD that can handle both dependent and independent fuzzy variables. Using this model with an AUDIT score-based cutoff, we investigated the effects of alcohol usage on viral loads via CD4 counts in people living with HIV and AIDS (PLWHA). An AUDIT score of 15 is taken as the cutoff.
The strength or weakness of the immune system determines whether the dependent variable, viral load, decreases or increases in the body, and this differs from person to person. T-cell depletion is less common in people with a healthy immune system, but it is more common in people who have a weak immune system. This implies that the variable viral load is uncertain or fuzzy. As a result, the quantity of immune cells and the HIV viral load at various stages of the disease can be regarded as fuzzy [3].

Methods
In the study, a statistical model for predicting alcohol consumption and HIV progression is formulated. We considered a case in which AUDIT scores and CD4 counts are fuzzy. The Modified Sharp Regression Discontinuity model incorporates imprecise observations. Previous studies have considered cases in which the observations are crisp. In practice this is not always the case, since the variables involved will at times be fuzzy.
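To make the idea of a fuzzy observation concrete, the sketch below evaluates a membership function for a single fuzzy AUDIT score. The triangular shape, the function name `triangular_membership`, and the spread of two points are illustrative assumptions; the paper does not state which membership functions were used.

```python
def triangular_membership(x, left, peak, right):
    """Degree (0 to 1) to which x belongs to a triangular fuzzy number.

    The triangular shape is an assumption made for illustration only.
    """
    if not left < peak < right:
        raise ValueError("expected left < peak < right")
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy AUDIT observation "about 15", with a spread of 2 points.
for score in (13, 14, 15, 16, 17):
    print(score, triangular_membership(score, 13, 15, 17))
```

A crisp observation is the limiting case in which the membership is 1 at a single value and 0 everywhere else.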
The aim is to calculate the discontinuity in the outcome variable at the cutoff point. This is equal to the Average Causal Effect (ACE), i.e., the treatment effect for the specific subpopulation at the cutoff. We can define the conditional means of the outcome from the right and from the left of the cutoff. These two can be estimated separately and then differenced, or the difference between them can be found at once. Taking the former approach, we can solve for the estimate on each side, which results in Eq. (7) and Eq. (8). The difference between the estimates in Eq. (8) and Eq. (7) is the size of the discontinuity. Alternatively, we can run a single regression and use Ordinary Least Squares to estimate the parameters.
We started from a classical regression, Eq. (1). Since the dependent and independent variables are fuzzy, the model is instead written in terms of $y_{ij}^{*}$ and $x_{ij}^{*}$, which are fuzzy observations with membership functions $\mu_{y_{ij}}(y)$ and $\mu_{x_{ij}}(x)$ respectively, and $e_{ij}^{**}$, the fuzzy error associated with the model. This error term can be estimated using the idea of [4]. Eq. (16) and Eq. (17) are then used to calculate the treatment effect.
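For orientation, the estimand described here can be written in standard sharp-RDD notation. This is a sketch under stated assumptions, not a restatement of the paper's numbered equations: $x'$ denotes the cutoff, $d_{ij}$ is taken to be a treatment indicator, and the pooled specification with coefficients $(\alpha, \beta, \gamma, \delta)$ is one common form whose assignment of coefficients to terms is assumed.

```latex
\begin{align*}
\mathrm{ACE} &= \lim_{x \downarrow x'} E\left(y_{ij} \mid x_{ij} = x\right)
              - \lim_{x \uparrow x'} E\left(y_{ij} \mid x_{ij} = x\right), \\
d_{ij} &= \mathbb{1}\{x_{ij} \ge x'\}, \qquad x_{ij}^{c} = x_{ij} - x', \\
y_{ij} &= \alpha + \beta x_{ij}^{c} + \gamma d_{ij} x_{ij}^{c} + \delta d_{ij} + e_{ij}.
\end{align*}
```

Under this assumed specification, the jump at the cutoff ($x_{ij}^{c} = 0$) is $\delta$ and the intercept on the treated side is $\alpha + \delta$, which agrees with the quantity $\nu = \alpha + \delta$ used in the simulation section.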

Implementation of the modified sharp regression discontinuity model
With perfect compliance (100%), we measured the size of the jump at the cutoff, indicating a sharp regression discontinuity. All participants at or above the cutoff received treatment, while those below it did not. The graphical representation is shown in Fig. 1.

Check for discontinuity in the running variable around cutoff point in sharp RDD
To see whether the running variable was manipulated, imagine that a large number of participants were clustered around 15 because of how the AUDIT was administered; that is, respondents wanted to get into the program, so they purposefully answered some questions incorrectly.
We checked for this by making a histogram of the running variable (AUDIT scores) and looking for any significant jumps around the threshold. There was only a very slight visible difference in the height of the bars before and after the 15-score cutoff, so there does not appear to be a jump around the threshold in this case. A McCrary density test of whether any jump was statistically significant gave a p value of 0.47, indicating that there is no significant difference near the threshold [5].
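The sketch below shows a much cruder check than the McCrary density test: an exact two-sided binomial test of whether the counts just below and just above the cutoff are compatible with an even split. It is a swapped-in simplification, not the test used in the paper; the function name `bunching_p_value`, the window width, and the example scores are hypothetical.

```python
from math import comb


def bunching_p_value(scores, cutoff, width):
    """Exact two-sided binomial test for an even split around the cutoff.

    Counts scores in [cutoff - width, cutoff) and in [cutoff, cutoff + width).
    This is a simplification, not the McCrary density test.
    """
    below = sum(1 for s in scores if cutoff - width <= s < cutoff)
    above = sum(1 for s in scores if cutoff <= s < cutoff + width)
    n = below + above
    if n == 0:
        raise ValueError("no observations inside the window")
    # Probability of each possible "above" count under a fair 50/50 split.
    probs = [comb(n, k) * 0.5 ** n for k in range(n + 1)]
    observed = probs[above]
    # Add up every outcome that is no more likely than the observed one.
    p_value = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return below, above, min(1.0, p_value)


# Hypothetical AUDIT scores near a cutoff of 15.
example_scores = [12, 13, 13, 14, 14, 14, 15, 15, 16, 16, 17, 17]
print(bunching_p_value(example_scores, cutoff=15, width=3))
```

A small p value would suggest bunching on one side of the threshold. Because the test assumes a locally flat density, it is only reasonable for a narrow window.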

Check for discontinuity in the outcome variable across running variable in sharp RDD
Because this is a sharp regression discontinuity design and there was no bunching of AUDIT scores around the 15-point threshold, we can now check whether there was a discontinuity in the final AUDIT scores based on participation in the counseling program. Figure 2 clearly shows a discontinuity. It appeared that taking part in the counseling program improved the CD4 counts and, consequently, the final AUDIT scores. For AUDIT scores, the improvement is illustrated by the graph in Figure 3.

Estimation of the size of the effect in sharp RDD
The coefficient of greatest interest is the one associated with the counseling program. This causal effect of counseling for compliers in the sharp RDD is approximately 0.199, which is lower than that in the fuzzy RDD [6]. In other words, it is a small effect in the positive direction [7]. This was the change in intercept when counseling was effective, or the variation in AUDIT scores at the threshold. Participating in the counseling program lowered AUDIT scores after six months, which raised CD4 counts and lowered viral loads.
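As an illustration of how a jump at the cutoff can be estimated on crisp data, the sketch below fits a separate straight line on each side of the cutoff and takes the difference of the two fitted values at the cutoff. It is a minimal sketch, not the fuzzy estimator developed in this paper; the function names and the noise-free example data are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points on each side of the cutoff")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rdd_jump(scores, outcomes, cutoff):
    """Treated-side minus untreated-side fitted value at the cutoff."""
    treated_x, treated_y, control_x, control_y = [], [], [], []
    for score, outcome in zip(scores, outcomes):
        # Sharp design: treated exactly when the score is at or above the cutoff.
        if score >= cutoff:
            treated_x.append(score - cutoff)
            treated_y.append(outcome)
        else:
            control_x.append(score - cutoff)
            control_y.append(outcome)
    # After centering, each intercept is the fitted value at the cutoff.
    treated_intercept, _ = fit_line(treated_x, treated_y)
    control_intercept, _ = fit_line(control_x, control_y)
    return treated_intercept - control_intercept


# Hypothetical noise-free data with a built-in jump of 1.0 at a cutoff of 15.
scores = list(range(10, 21))
outcomes = [2.0 + 0.5 * (s - 15) + (1.0 if s >= 15 else 0.0) for s in scores]
print(sharp_rdd_jump(scores, outcomes, cutoff=15))
```

Centering the running variable at the cutoff is what makes the difference of intercepts equal to the discontinuity.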

Simulation of estimates based on modified sharp regression discontinuity model
In this section, we performed simulations to assess the asymptotic properties of the proposed methodology. For this model, using a cutoff $x' = 15$, the parameters were set at $\nu = 10$, $\beta = 0.5$ and $\gamma = 0.3$, where $\nu = \alpha + \delta$ and $\eta = (\nu, \beta, \gamma)^{T}$. Based on 1000 replications, Table 1 shows the simulated estimates for n = 30, n = 50, n = 100 and n = 200.
The resulting distributions of the Modified Sharp RDD estimates were also bell-shaped, as shown in Figure 4.
As the sample size grew, the estimates derived using the modified sharp regression discontinuity model converged in probability to the true value, supporting the consistency of the estimator. Based on the simulation results, we can conclude that the estimates from both the Modified Fuzzy and Sharp regression discontinuity models are asymptotically consistent and follow a normal distribution [6].
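A simulation of this kind can be sketched on crisp data as follows. The sample sizes, number of replications, cutoff, and parameter values are taken from the text. The data-generating process is assumed: a running variable uniform on the AUDIT range 0 to 40, Gaussian noise with standard deviation 1, and a treated-side line with intercept $\nu$ and slope $\beta + \gamma$. This is not the authors' fuzzy simulation.

```python
import random
import statistics


def intercept_of_fitted_line(xs, ys):
    """Intercept of the least-squares line through the points (xs, ys)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_y - (sxy / sxx) * mean_x


def simulate_nu_estimates(n, replications, seed=0, cutoff=15.0,
                          nu=10.0, beta=0.5, gamma=0.3, noise_sd=1.0):
    """Return one estimate of nu per replication, each from a sample of size n."""
    rng = random.Random(seed)
    estimates = []
    while len(estimates) < replications:
        xs, ys = [], []
        for _ in range(n):
            score = rng.uniform(0.0, 40.0)  # assumed running-variable distribution
            if score >= cutoff:  # sharp design: treated at or above the cutoff
                centered = score - cutoff
                xs.append(centered)
                # Assumed treated-side line: intercept nu, slope beta + gamma.
                ys.append(nu + (beta + gamma) * centered + rng.gauss(0.0, noise_sd))
        if len(xs) < 3:  # too few treated units to fit a line; redraw the sample
            continue
        estimates.append(intercept_of_fitted_line(xs, ys))
    return estimates


if __name__ == "__main__":
    # Print sample size, mean estimate, and spread of the estimates.
    for n in (30, 50, 100, 200):
        draws = simulate_nu_estimates(n, replications=1000, seed=n)
        print(n, round(statistics.fmean(draws), 3), round(statistics.stdev(draws), 3))
```

If the estimator is consistent, the mean of the estimates should stay close to the true value of 10 while their spread shrinks as n increases.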

Discussion
Our study presents a Modified Sharp Regression Discontinuity model that successfully addresses the challenges posed by fuzzy variables in HIV and AIDS patient data. By incorporating both fuzzy independent and dependent variables, our model offers a more accurate prediction of alcohol consumption and its impact on HIV progression.
In comparing our results to similar studies conducted by other researchers, we find that our Modified Sharp RDD provides more robust estimates for the causal effect of counseling on alcohol consumption among patients living with HIV and AIDS [6]. The estimator demonstrated consistency and asymptotic convergence towards the true value, validating its reliability.
The findings of our simulation studies support the claim that the Modified Fuzzy and Sharp RDD models are asymptotically consistent and follow a normal distribution as discussed by [6]. These results hold promise for improved causal inference in settings with fuzzy variables.
However, we acknowledge some limitations in our study. The external validity of our findings may be restricted due to the inclusion of data from a specific region (Zimbabwe). Therefore, caution should be exercised when generalizing our results to other populations.
In conclusion, our study contributes to the growing body of research on RDD models and provides a valuable framework for addressing fuzzy variables in various applications, particularly in the field of healthcare research. Further research can explore the potential of our Modified Sharp RDD in broader contexts and different datasets.

Limitations
Because this study included data only from Zimbabwe, the external validity of the results is restricted. The findings may be reliably extrapolated to the population of PLWHA in Zimbabwe, and as a result, the internal validity is high. However, it is debatable whether the Zimbabwean participants reflect the broader population of people living with HIV and AIDS in Africa and beyond.
Note that in the sharp regression discontinuity we do not have non-compliers with the program, as we do in the fuzzy RDD. Letting $x_{ij}^{c} = x_{ij}^{*} - x'^{*}$ and $\psi = (\alpha, \beta, \gamma, \delta)^{T}$, we have a model in matrix form. Eq. (14) takes one form when $d_{ij} = 0$ and another when $d_{ij} = 1$, and the resulting equations are then solved.

Fig. 1 Compliance around the cutoff

Fig. 2

Fig. 3 Relationship between baseline AUDIT scores and AUDIT scores at 6 months

Fig. 4 Distribution of sharp estimate ν as the sample size increases

Table 1 Modified Sharp RDD simulated estimates