
Characterizing alternative and emerging tobacco product transition of use behavior on Twitter



The objective of this study was to develop an inductive coding approach specific to characterizing user-generated social media conversations about transition of use of different tobacco and alternative and emerging tobacco products (ATPs).


A total of 40,206 geocoded tweets posted from 2018 to 2019 were collected from the Twitter public API stream. Using data mining approaches, these tweets were then filtered for keywords associated with tobacco and ATP use behavior. This resulted in a subset of 5718 tweets, with 657 manually annotated and identified as associated with user-generated conversations about tobacco and ATP use behavior. The 657 tweets were coded into 9 parent codes: inquiry, interaction, observation, opinion, promote, reply, share knowledge, use characteristics, and transition of use behavior. The highest number of observations occurred under transition of use (43.38%, n = 285), followed by current use (39.27%, n = 258), opinions about use (7.00%, n = 46), and product promotion (5.63%, n = 37). The other codes had fewer than ten tweets each. Results provide early insights into how social media users discuss topics related to transition of use and their experiences with different and emerging tobacco product use behavior.


Social media is now a common source of health-related information [1]. This includes user-generated conversations about a variety of topics, with an emerging field focused on better understanding knowledge, attitudes, and behaviors related to tobacco, alternative and emerging tobacco products (ATPs), and electronic nicotine delivery systems (ENDS) [2, 3]. User-generated social media conversations can be assessed [4] to better understand how health behaviors are changing closer to real time [5]. This approach has certain advantages over traditional survey methodology, including faster identification of emerging trends [6]. However, methods to appropriately code social media content for specific health-related topics remain underdeveloped, particularly in the context of characterizing transitions in behaviors that change over time.

Twitter is a microblogging social networking platform that allows users to tweet 280-character messages, which can then be retweeted, favorited, and shared across a network of online users [7]. Users can form online communities [8] by interacting with other users who share similar beliefs, interests, and opinions about topics. This includes users who initiate, use, and transition between different tobacco, ATP, and ENDS products [9, 10]. In fact, Twitter has specifically become a platform for sharing information about electronic cigarettes (e-cigarettes) [11,12,13,14], a nicotine delivery device that has been commercially available only in the past decade [15].

Evidencing the growing popularity of vaping behavior, studies have shown that online searches for electronic cigarettes have increased [16]. However, increased uptake of different types of e-cigarettes (e.g., Juul, heat-not-burn, etc.), particularly among youth and young adults, has not been without controversy [17]. Ongoing concerns about the long-term health impact of nicotine consumption [18] and e-cigarette-related adverse events [19] (e.g., the 2019 outbreak of e-cigarette or vaping product use-associated lung injury [20,21,22]), together with mixed evidence about the efficacy of ATPs as cessation devices, continue to raise public health and patient safety issues [23, 24]. These concerns are accentuated when trying to assess the interaction of use behavior between traditional combustible tobacco products (e.g., cigarettes, cigars) and ENDS [25].

Understanding the pathways of transition of tobacco and ATP use, including what products users initiate on, why they switch between products, and the unique health harms related to dual use (i.e., simultaneous use of both combustible products and ATPs/ENDS), is still a relatively underdeveloped area of study [26]. Hence, the objective of this study was to examine Twitter conversations in relation to transition of use associated with ENDS, with a focus on developing an inductive coding approach specific to characterizing transition of use knowledge, attitudes, and behaviors.

Main text


We conducted a retrospective observational social media study in two phases: (1) data collection; and (2) content analysis using an unsupervised machine learning and inductive coding approach. Inductive content analysis was used to identify and characterize posts relevant to tobacco and ATP use (i.e., “signal” tweets) and involved manual annotation by coders with training in tobacco and substance use behavior, with results used to generate a codebook of transition and behavioral-related themes that could also be iterated on in future social media studies.

Data collection

Data were first collected from the Twitter public streaming API with a filter to collect all geocoded tweets located in the United States, with no further language or demographic restrictions. The data collection period ran from 07/21/2018 to 07/21/2019. This initial dataset of geocoded tweets was then filtered for the keywords and hashtags “vape” and “vaping” in order to better isolate Twitter posts relevant to the study aims and for purposes of preliminary data analysis about ENDS behavior. The collected data included the textual content of the tweet, user and account information, URLs, and the time and date of the post.
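The keyword filtering step described above can be sketched as follows. This is a minimal illustration, not the study's code: the record layout (a `text` field), the function names, and the choice of whole-word, case-insensitive matching are all assumptions, and the sample records are invented.

```python
import re

# The two keywords named in the text. The pattern also matches hashtag forms
# (#vape, #vaping) because "#" is not a word character.
# Assumption: whole-word, case-insensitive matching.
KEYWORD_PATTERN = re.compile(r"\b(?:vape|vaping)\b", re.IGNORECASE)


def matches_keywords(text):
    """Return True if the text contains "vape" or "vaping" as a whole word."""
    return KEYWORD_PATTERN.search(text) is not None


def filter_tweets(tweets):
    """Keep only records whose (hypothetical) "text" field matches a keyword."""
    return [t for t in tweets if matches_keywords(t.get("text", ""))]


# Invented example records, not study data.
sample = [
    {"id": 1, "text": "Switched from cigarettes to vaping last month"},
    {"id": 2, "text": "Traffic is terrible today"},
    {"id": 3, "text": "New flavors in stock #Vape"},
]

kept = filter_tweets(sample)
print([t["id"] for t in kept])
```

Whole-word matching excludes variants such as "vapes" or "vaper"; a substring match would capture them at the cost of more noise.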

Data mining

To identify themes in our full corpus of tweets, we used an unsupervised machine learning approach called the Biterm Topic Model (BTM), which is designed to detect patterns in data and summarize an entire corpus of tweets into distinct, highly correlated categories [27]. BTM sorts short text into highly prevalent themes without the need for predetermined coding or training and has previously been used to explore key public health topics [28,29,30,31]. For each topic, BTM generates the top 20 words that represent the topic cluster. These topics were then reviewed to identify clusters of Twitter conversations relevant to vaping and transition of use, which allowed us to identify “signal” topics based on the BTM output and eliminate irrelevant topics. BTM topics were generated after applying keyword filters and were included for further analysis if they were pertinent to vape and vaping behavior. Topics were excluded if they were irrelevant or appeared to correlate with non-user-generated conversations (e.g., news tweets).
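BTM takes its name from the biterm, an unordered pair of words that co-occur in the same short text; the model learns topics from these pairs pooled across the whole corpus rather than from per-document word counts [27]. The sketch below shows only the biterm extraction step, not topic inference, and is not the study's implementation. The function name and the toy tokenized tweets are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered word pairs (biterms) from one short text."""
    # Sorting each pair makes ("vape", "quit") and ("quit", "vape") identical.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Invented, already-tokenized toy tweets (not study data).
docs = [
    ["quit", "cigarettes", "vape"],
    ["vape", "quit", "nicotine"],
]

# Pool biterms over the whole corpus, as BTM does before inferring topics.
biterm_counts = Counter(b for doc in docs for b in extract_biterms(doc))
print(biterm_counts.most_common(1))
```

Pooling pairs across the corpus is what makes the approach suited to tweets, where any single text contains too few words to estimate topics on its own.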

We then extracted all posts from the selected vaping BTM topics and manually coded the content of tweets in these topics to ensure relevance to user-generated tobacco and ENDS use behavior. Posts were excluded as signals if they were: (1) news related and not organically user-generated content; (2) not written in English; or (3) retweets, with a tweet that had been retweeted counted as only one tweet. All other tweets, including replies and tweets containing photos or videos, were included so that additional contextual information could be assessed alongside content analysis of the tweet text. Transition of use was classified as switching from one tobacco or ATP/ENDS product to another.
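In the study these exclusions were applied through manual review. Purely as an illustration, the language and retweet criteria could be pre-screened programmatically along the following lines. The field names `lang` and `retweet_of`, the function names, and the sample records are hypothetical; the news versus user-generated distinction is left to human coders.

```python
def is_english(tweet):
    """Check a hypothetical "lang" field for English."""
    return tweet.get("lang") == "en"


def collapse_retweets(tweets):
    """Keep one record per original tweet so retweeted content counts once."""
    seen = set()
    unique = []
    for tweet in tweets:
        # A retweet points at its original through the hypothetical
        # "retweet_of" field; an original tweet is keyed by its own id.
        key = tweet.get("retweet_of") or tweet["id"]
        if key not in seen:
            seen.add(key)
            unique.append(tweet)
    return unique


# Invented example records, not study data.
sample = [
    {"id": 10, "lang": "en", "retweet_of": None},
    {"id": 11, "lang": "en", "retweet_of": 10},   # retweet of tweet 10
    {"id": 12, "lang": "es", "retweet_of": None},  # not written in English
    {"id": 13, "lang": "en", "retweet_of": None},
]

screened = collapse_retweets([t for t in sample if is_english(t)])
print([t["id"] for t in screened])
```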

Content analysis

Tweets and any associated URLs/hyperlinks were aggregated into a table and imported into Atlas.ti qualitative software for content analysis [32]. A first iterative, inductive analysis of the data was conducted (JSY) to identify thematic areas and classify tweets into codes with code descriptions. Tweets were read to identify thematic areas in the dataset and then coded based on thematic areas of interest. Codes and coding descriptions were developed and modified iteratively throughout the coding process. A second analysis of the dataset was undertaken to expand the codebook to include subcodes. Subcodes and subcode descriptions were created and modified iteratively during this second round of data coding. Once a coding scheme was developed, the data were coded, extracted, and reviewed by a second coder (CB) to assess the validity of the coding scheme. The final coding scheme and distribution of codes are presented in Fig. 1 and Table 1.

Fig. 1 Tweets per tobacco and ATP theme

Table 1 Emergent coding scheme for Twitter data for tobacco and ATP use behavior (examples de-identified by paraphrasing)

Ethics and data collection

Data were collected from the Twitter public API stream and included publicly available tweets that were filtered for posts with geolocation/geotagged information. As the study did not involve human subjects, involved no interactions with online users, and only used publicly available data that were further de-identified for research purposes, ethics and IRB approval were not required, and Twitter users were not consented into this study [33]. Any user identifiable information was removed from the study results.


A total of 40,206 tweets were collected after filtering for “vape” and “vaping” keywords/hashtags. After data filtering, we ran BTM on the keyword filtered data to generate topic clusters and reviewed them for relevance to study aims. We chose 16 BTM clusters, which comprised a total of 5728 (14.25%) tweets selected based on word groupings relevant to vaping and ATP/ENDS behavior terms. After manually annotating these tweets for characteristics relevant to tobacco and ATP/ENDS use and behavior, we removed all non-signal tweets, leaving 589 signal tweets related to transition of use that were further analyzed. The 589 signal posts were categorized into 10 tobacco/ATP/ENDS general use and behavior thematic codes listed and identified in Table 1.

Specific to codes related to transition of use (48.39%, n = 285), thirteen distinct tobacco/ATP/ENDS transition pathways were identified; the term “vaping” was used to describe both nicotine vaping and vaping of cannabis-based products. Transitions detected, expressed as a percentage of the 589 signal tweets, were cannabis to cannabis (0.51%, n = 3), cannabis to e-cigarettes (0.68%, n = 4), chewing tobacco to e-cigarettes (1.02%, n = 6), cigarettes to e-cigarettes (27.16%, n = 160), cigarettes to no product (1.36%, n = 8), cigarettes to vape cannabis (0.68%, n = 4), e-cigarette to cannabis (0.17%, n = 1), e-cigarette to cigarette (1.19%, n = 7), e-cigarette to e-cigarette (2.21%, n = 13), e-cigarette to no product (7.30%, n = 43), no product to e-cigarette (2.89%, n = 17), no product to vape cannabis (2.55%, n = 15), and unknown product to e-cigarette (0.68%, n = 4).
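The pathway counts reported above can be converted into shares of the 589 signal tweets with a few lines of arithmetic, which also serves as a consistency check on the total of 285 transition tweets. The counts are taken from the text; the function and variable names are illustrative.

```python
# Total number of signal tweets stated in the text.
TOTAL_SIGNAL_TWEETS = 589

# Pathway counts (n) as reported in the text.
pathway_counts = {
    "cannabis to cannabis": 3,
    "cannabis to e-cigarettes": 4,
    "chewing tobacco to e-cigarettes": 6,
    "cigarettes to e-cigarettes": 160,
    "cigarettes to no product": 8,
    "cigarettes to vape cannabis": 4,
    "e-cigarette to cannabis": 1,
    "e-cigarette to cigarette": 7,
    "e-cigarette to e-cigarette": 13,
    "e-cigarette to no product": 43,
    "no product to e-cigarette": 17,
    "no product to vape cannabis": 15,
    "unknown product to e-cigarette": 4,
}


def percent_of(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


transition_total = sum(pathway_counts.values())
shares = {
    pathway: percent_of(n, TOTAL_SIGNAL_TWEETS)
    for pathway, n in pathway_counts.items()
}

print(transition_total, percent_of(transition_total, TOTAL_SIGNAL_TWEETS))
for pathway, share in shares.items():
    print(f"{pathway}: {share}%")
```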

There were also transitions among different ATP product types as well as cannabis product types, one of which was vaping a cannabis product. Vaping use factors that were observed as influencing transition of use included self-reporting of addiction prompting use, reaction to adverse symptoms, cost of ATPs/ENDS, faulty or broken ATPs/ENDS, preference for flavors, losing or misplacing ATPs/ENDS, interest in polysubstance use, concern about reducing nicotine levels, stigma, and the alleged therapeutic effects of vaping, especially cannabis.


This study explored user-generated conversations occurring on Twitter in relation to tobacco and ATP/ENDS use, with a specific focus on transition of use between these highly addictive products. We observed that this subset of Twitter users actively tweeted about their experience using tobacco and ATPs/ENDS, providing valuable information about a behavior that is influenced by a changing landscape of new and emerging nicotine products. The majority of tweets reviewed related to tobacco and ATP/ENDS use and behavior characteristics, including users asking about tobacco/ATP/ENDS products and how to quit, observations of tobacco/ATP/ENDS use behavior, opinions about products and vaping (including claims that vaping is a healthier alternative to tobacco or has therapeutic benefits), sharing knowledge about tobacco/ATP/ENDS products, and specific characteristics of use (e.g., addiction, adverse events, costs, flavoring, tricks, etc.).

Close to half of all conversations discussed transition of use behavior, with users actively discussing the types of tobacco/ATP/ENDS products they used and switched between and providing reasons for changing products. A wide variety of tobacco/ATP/ENDS products were mentioned, including combustible tobacco products (e.g., cigarettes), chewing tobacco, different types of e-cigarettes (Juul, vaping pens, etc.), and cannabis smoking products. Transition was observed between different products and within specific product classes (i.e., transitioning from one type of e-cigarette product to another), with some users (n = 32) self-reporting polytobacco and polysubstance behavior (e.g., smoking cigarettes and also vaping). Users expressed various sentiments about different products, including how products could act as substitutes for others, what products made them feel better, attempts to quit use of one product by switching to another, and issues related to cost and access. Some users stated that cannabis vaping products helped them with cessation of nicotine addiction.

Based on these preliminary results, Twitter appears to enable robust conversation and sharing of information related to tobacco and ATP/ENDS use. It can act as a digital forum for smokers and vapers to accumulate knowledge and share experiences, and it may potentially contribute to behavior change associated with nicotine use and addiction.


The results of our study are exploratory in nature and were derived from a sample of general geolocated tweets over a one-year period, which were then filtered for common vaping keywords and analyzed using unsupervised machine learning. The results of this study are not generalizable to overall trends in tobacco or ATP/ENDS behavior, but nevertheless provide important insights into conversations occurring among Twitter users specific to transition of tobacco and nicotine product use. Themes associated with the transition of use primarily concerned users navigating quit attempts or reporting past trouble quitting, users who had relapsed to nicotine addiction, and users who had quit cigarettes but still vaped. These results provide early evidence that experiences in transition of use also present opportunities for more targeted cessation interventions, particularly in the context of increasing knowledge of known health harms related to tobacco use and nicotine addiction and exposure [34, 35]. Future work should include confirmatory studies to assess whether the observed themes related to transition of use knowledge, attitudes, and behaviors hold true in other digital communities, and should use more structured research approaches to generalize findings. Future studies should also examine other platforms now popular among youth and young adults, such as Instagram, Snapchat, and TikTok.


This study was exploratory and meant to generate hypotheses for future research. The study’s limitations include the use of a single platform and the possibility that Twitter user demographics do not reflect those of the general population of tobacco/ATP/ENDS users. The sample of tweets was also a convenience sample generated from geocoded tweets and hence may be subject to sampling bias, as it is estimated that only 1% of all tweets are geocoded [36, 37]. Future studies should use multiple Twitter APIs to generate a more representative Twitter dataset.

Availability of data and materials

The de-identified data that support the findings of this study are available upon request to corresponding author and certain data will be available freely from the website


  1. Grajales FJ, Sheps S, Ho K, Novak-Lauscher H, Eysenbach G. Social media: a review and tutorial of applications in medicine and health care. J Med Internet Res. 2014;16(2):e13.
  2. Giustini DM, Ali SM, Fraser M, Boulos MNK. Effective uses of social media in public health and medicine: a systematic review of systematic reviews. OJPHI. 2018;10(2). Accessed 15 Dec 2020.
  3. Lazard AJ, Wilcox GB, Tuttle HM, Glowacki EM, Pikowski J. Public reactions to e-cigarette regulations on Twitter: a text mining analysis. Tob Control. 2017;26(e2):e112–6.
  4. Kim A, Miano T, Chew R, Eggers M, Nonnemaker J. Classification of twitter users who tweet about E-cigarettes. JMIR Public Health Surveill. 2017;3(3):e63.
  5. Maher CA, Lewis LK, Ferrar K, Marshall S, De Bourdeaudhuij I, Vandelanotte C. Are health behavior change interventions that use online social networks effective? A systematic review. J Med Internet Res. 2014;16(2):e40.
  6. Allem J-P, Ferrara E, Uppu SP, Cruz TB, Unger JB. E-cigarette surveillance with social media data: social bots, emerging topics, and trends. JMIR Public Health Surveill. 2017;3(4):e98.
  7. Edo-Osagie O, De La Iglesia B, Lake I, Edeghere O. A scoping review of the use of Twitter for public health research. Comput Biol Med. 2020;122:103770.
  8. Mogul DB, Nagy PG, Bridges JFP. Building stronger online communities through the creation of facebook-integrated health applications. JAMA Pediatr. 2017;171(10):933.
  9. Katz MS, Anderson PF, Thompson MA, Salmi L, Freeman-Daily J, Utengen A, et al. Organizing online health content: developing hashtag collections for healthier internet-based people and communities. JCO Clin Cancer Inform. 2019;3:1–10.
  10. The National Dental PBRN Collaborative Group, Cutrona SL, Sadasivam RS, DeLaughter K, Kamberi A, Volkman JE, et al. Online tobacco websites and online communities—who uses them and do users quit smoking? The quit-primo and national dental practice-based research network Hi-Quit studies. Behav Med Pract Policy Res. 2016;6(4):546–57.
  11. McCausland K, Maycock B, Leaver T, Wolf K, Freeman B, Jancey J. E-cigarette advocates on twitter: content analysis of vaping-related tweets. JMIR Public Health Surveill. 2020;6(4):e17543.
  12. McCausland K, Maycock B, Leaver T, Wolf K, Freeman B, Thomson K, et al. E-cigarette promotion on twitter in Australia: content analysis of tweets. JMIR Public Health Surveill. 2020;6(4):e15577.
  13. Lazard AJ, Saffer AJ, Wilcox GB, Chung AD, Mackert MS, Bernhardt JM. E-cigarette social media messages: a text mining analysis of marketing and consumer conversations on twitter. JMIR Public Health Surveill. 2016;2(2):e171.
  14. Kim AE, Hopper T, Simpson S, Nonnemaker J, Lieberman AJ, Hansen H, et al. Using twitter data to gain insights into e-cigarette marketing and locations of use: an infoveillance study. J Med Internet Res. 2015;17(11):e251.
  15. Dinardo P, Rome ES. Vaping: the new wave of nicotine addiction. CCJM. 2019;86(12):789–98.
  16. Ayers JW, Althouse BM, Allem J-P, Leas EC, Dredze M, Williams RS. Revisiting the rise of electronic nicotine delivery systems using search query surveillance. Am J Prev Med. 2016;50(6):e173–81.
  17. Fadus MC, Smith TT, Squeglia LM. The rise of e-cigarettes, pod mod devices, and JUUL among youth: factors influencing use, health implications, and downstream effects. Drug Alcohol Depend. 2019;201:85–93.
  18. DeVito EE, Krishnan-Sarin S. E-cigarettes: impact of E-liquid components and device characteristics on nicotine exposure. CN. 2018;16(4):438–59.
  19. Winnicka L, Shenoy MA. EVALI and the pulmonary toxicity of electronic cigarettes: a review. J Gen Intern Med. 2020;35(7):2130–5.
  20. Kalininskiy A, Bach CT, Nacca NE, Ginsberg G, Marraffa J, Navarette KA, et al. E-cigarette, or vaping, product use associated lung injury (EVALI): case series and diagnostic approach. Lancet Respir Med. 2019;7(12):1017–26.
  21. Blount BC, Karwowski MP, Shields PG, Morel-Espinosa M, Valentin-Blasini L, Gardner M, et al. Vitamin E acetate in bronchoalveolar-lavage fluid associated with EVALI. N Engl J Med. 2020;382(8):697–705.
  22. Morgan JC, Silver N, Cappella JN. How did beliefs and perceptions about e-cigarettes change after national news coverage of the EVALI outbreak? Cummings M, editor. PLoS ONE. 2021;16(4):e0250908.
  23. Ghosh S, Drummond MB. Electronic cigarettes as smoking cessation tool: are we there? Curr Opin Pulm Med. 2017;23(2):111–6.
  24. Rehan HS, Maini J, Hungin APS. Vaping Versus Smoking: A Quest for Efficacy and Safety of E-cigarette. CDS. 2018;13(2):92–101.
  25. Wang JB, Olgin JE, Nah G, Vittinghoff E, Cataldo JK, Pletcher MJ, et al. Cigarette and e-cigarette dual use and risk of cardiopulmonary symptoms in the Health eHeart Study. PLoS ONE. 2018;13(7):e0198681.
  26. Worku D, Worku E. A narrative review evaluating the safety and efficacy of e-cigarettes as a newly marketed smoking cessation tool. SAGE Open Med. 2019;7:205031211987140.
  27. Yan X, Guo J, Lan Y, Cheng X. A biterm topic model for short texts. In: Proceedings of the 22nd international conference on World Wide Web - WWW ’13. Rio de Janeiro, Brazil: ACM Press; 2013. p. 1445–56. Accessed 29 Mar 2021.
  28. Kalyanam J, Katsuki T, Lanckriet G, Mackey TK. Exploring Trends of Nonmedical use of Prescription Drugs and Polydrug Abuse in the Twittersphere Using Unsupervised Machine Learning. Addict Behav. 2017;65:289–95.
  29. Li J, Chen W-H, Xu Q, Shah N, Kohler JC, Mackey TK. Detection of self-reported experiences with corruption on twitter using unsupervised machine learning. Soc Sci Human Open. 2020;2(1):100060.
  30. Haupt MR, Jinich-Diamant A, Li J, Nali M, Mackey TK. Characterizing twitter user topics and communication network dynamics of the “liberate” movement during COVID-19 using unsupervised machine learning and social network analysis. Online Social Netw Media. 2020;21:100114.
  31. Mackey TK, Purushothaman V, Haupt M, Nali M, Li J. Application of unsupervised machine learning to identify and characterize hydroxychloroquine misinformation on twitter. Lancet Digital Health. 2021;3(2):e72–5.
  32. Weber R. Basic Content Analysis. Thousand Oaks, CA: SAGE Publications, Inc.; 1990. Accessed 25 Mar 2021.
  33. Solberg, Lauren, Data Mining on Facebook: A Free Space for Researchers or an IRB Nightmare? (November 28, 2010). Journal of Law, Technology and Policy, Vol. 2010, No. 2, 2010, SSRN:
  34. Kazemzadeh Z, Manzari ZS, Pouresmail Z. Nursing interventions for smoking cessation in hospitalized patients: a systematic review. Int Nurs Rev. 2017;64(2):263–75.
  35. Gotts JE, Jordt S-E, McConnell R, Tarran R. What are the respiratory effects of e-cigarettes? BMJ. 2019;1:l5275.
  36. Ajao O, Hong J, Liu W. A survey of location inference techniques on Twitter. J Inf Sci. 2015;41(6):855–64.
  37. Bakerman J, Pazdernik K, Wilson A, Fairchild G, Bahran R. Twitter geolocation: a hybrid approach. ACM Trans Knowl Discov Data. 2018;12(3):1–17.





This research was supported by the Tobacco-Related Disease Research Program (Awards #T29IP0465 and #T29IP0384).

Author information




CB, JY, JL, and TKM jointly conceived the study, drafted the study, conducted data collection and analysis, and wrote and agreed to the final version of this manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Tim K. Mackey.

Ethics declarations

Ethics approval and consent to participate

Not applicable/Not required for this study. All information collected from this study was from the public domain and the study did not involve any interaction with users. Any user identifiable information was removed from the study results.

Consent for publication

Not applicable.

Competing interests

TKM, JL and CB are employees of the startup company S-3 Research LLC. S-3 Research is a startup funded and currently supported by the National Institutes of Health – National Institute on Drug Abuse through a Small Business Innovation and Research contract for opioid-related social media research and technology commercialization. The authors report no other conflicts of interest associated with this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Bardier, C., Yang, J.S., Li, J. et al. Characterizing alternative and emerging tobacco product transition of use behavior on Twitter. BMC Res Notes 14, 303 (2021).



  • Tobacco behavior
  • Electronic cigarettes
  • Social media
  • Twitter
  • Qualitative research