WLCD: a dataset of lifestyle in relation with women’s cancer

Objectives Social media text mining has been widely used to extract information about the experiences and needs of patients with various diseases, especially cancer. Understanding these experiences and needs is necessary for patient management in primary care. Researchers have identified that lifestyle factors such as diet, exercise, alcohol, and smoking are associated with cancer risk, particularly the risk of women’s cancer. Given the growing global burden of women’s cancer, it is essential to monitor up-to-date data sources using text mining. Data description We have prepared six datasets on lifestyle components and women’s cancer. Five are independent component datasets: (1) a nutrition dataset containing 10,161 tweets; (2) an exercise dataset containing 9,412 tweets; (3) an alcohol dataset containing 2,132 tweets; (4) a smoking dataset containing 4,316 tweets; and (5) a lifestyle (term) dataset containing 1,861 tweets. The sixth is (6) a combined dataset that pools the five component datasets and contains 27,882 tweets. These data are provided to help discover people’s perspectives, knowledge, and experiences regarding lifestyle and women’s cancer. Hence, they should be valuable for healthcare providers seeking to develop more efficient patient management approaches.
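The dataset sizes reported above can be checked with simple arithmetic. The following is a minimal sketch, assuming the combined dataset is a plain concatenation of the five component datasets with no de-duplication; the dictionary keys and function names are illustrative and are not taken from the released files.

```python
# Tweet counts per component dataset, as reported in the data description.
# The key names are hypothetical labels, not file names from the release.
component_counts = {
    "nutrition": 10_161,
    "exercise": 9_412,
    "alcohol": 2_132,
    "smoking": 4_316,
    "lifestyle_term": 1_861,
}


def combined_size(counts):
    """Return the size of a dataset formed by concatenating all components."""
    return sum(counts.values())


def shares(counts):
    """Return each component's share of the combined dataset, in percent."""
    total = combined_size(counts)
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print(combined_size(component_counts))  # 27882, the reported combined size
    for name, pct in shares(component_counts).items():
        print(f"{name}: {pct}%")
```

Under this assumption, the five component counts sum exactly to the 27,882 tweets reported for the combined dataset.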

Data mining of social media platforms has become an important emerging tool for understanding the experiences and needs of cancer patients. A wealth of available information can be used to gain insight into the patient experience relating to lifestyle patterns [5]. A previous study using Twitter data related to breast cancer identified physical activity and healthy eating as important factors in patients' symptom management [6]. Another study, which analyzed tweets related to site-specific cancers, found that physical activity and alcohol consumption are among the lifestyle habits that might be associated with liver and breast cancer [7].
By analyzing social media conversations, researchers can identify patterns and trends related to these factors, which can be used to develop targeted public health policies to prevent or manage cancer risk [5,8].

Objective
Cancer is one of the leading causes of mortality and morbidity worldwide. Growing trends in cancer burden, especially among women, have become a significant global health issue [1]. Lifestyle factors, including unhealthy diet, physical inactivity [2], smoking, and alcohol use [3], are among the cancer risk factors targeted for primary control. Conversely, cancer progression and treatment might affect different aspects of patients' lifestyles [4].
Analyzing social media conversations can also help healthcare providers better address cancer patients' psychological and emotional needs [9]. By analyzing online discussions, investigators can gain insights into patients' awareness and identify opportunities to provide appropriate support and resources for lifestyle modification [8,10].
Social media data mining provides a unique opportunity for public health strategists to better understand people's attitudes toward the association between lifestyle and women's cancer and toward healthcare delivery [11]. By leveraging this information, researchers and healthcare providers can develop targeted interventions that promote healthy lifestyles and improve treatment outcomes, especially among cancer patients [11,12]. The main objective of this research is to provide Twitter-based datasets containing tweets related to lifestyle and women's cancer.

Limitations
• This study assessed only Twitter users, who may not represent the general population. Data from other social media platforms, such as Facebook, Instagram, and Reddit, might be needed to obtain more comprehensive results.
• People may report inaccurate or incomplete information about their lifestyle and health status due to social desirability bias.
• Habits and experiences reported by users may be time-specific, leading to potential misinterpretation.