Project Overview
This project investigates the frequency and usage of Anglicisms among university students.
Anglicisms are English loanwords or English-derived expressions that are incorporated into another language,
such as German. The project focuses on how often students use these words, in which contexts they use them,
and what motivates their usage.
The analysis was conducted for the course Research Methods in Linguistics at TU Dortmund. It combines survey design,
descriptive statistics, visualization, statistical testing, logistic regression, and interpretable classification models.
Research Motivation
English has a strong influence on modern communication, especially in academic, professional, and digital contexts.
Students often encounter English words through social media, internet culture, university courses, software tools,
and international communication. This makes Anglicisms an interesting topic for studying language contact and
linguistic change.
Main research question: How frequently do university students use Anglicisms, and do usage patterns
differ by study background, language background, and communication setting?
Survey Design
The data were collected using a structured survey. The questionnaire included consent, demographic information,
language background, study program, study-program language, general Anglicism usage, usage settings, motivations,
and frequency ratings for selected Anglicisms.
- Consent: all 30 respondents agreed to participate voluntarily.
- Participants: university students from different study programs.
- Core topic: frequency and context of Anglicism usage.
- Word examples: cool, sorry, computer, handy, weekend, fastfood, shopping, party, internet, okay, email, live.
- Rating scale: from “Never” to “Frequently”.
Dataset
The final dataset contains 30 survey responses. Respondents reported demographic information such as gender,
age, native language, study program, and study-program language. They also answered questions about whether,
how often, where, and why they use Anglicisms.
| Variable Group | Examples | Purpose |
| --- | --- | --- |
| Demographics | gender, age, native language | Describe participant background |
| Academic Background | study program, program language | Compare English/German/bilingual contexts |
| Usage Behavior | daily frequency, settings, situations | Measure how often and where Anglicisms are used |
| Motivations | easy to use, social media, friends, language contact | Explain why respondents use Anglicisms |
| Word-Level Ratings | cool, sorry, computer, weekend, internet, email | Analyze which Anglicisms are most integrated |
Descriptive Results
The respondent group was roughly balanced by gender and relatively young, with an average age of 26.47 years.
The native-language distribution was diverse, including German, English, Turkish, and other languages.
- Total survey responses analyzed: 30.
- Average respondent age: 26.47 years.
- Respondents who reported using Anglicisms: 96.67%.
Participant Background
- Gender: 14 female respondents and 16 male respondents.
- Age range: 19 to 35 years.
- Native language: German 26.67%, English 16.67%, Turkish 23.33%, Other 33.33%.
- Study-program language: English 40%, German 26.67%, bilingual 33.33%.
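Because the sample is small, the percentages above are easier to read as respondent counts. The following is a minimal sketch that converts them back, assuming each percentage is a share of all 30 respondents; the function and variable names are illustrative and not taken from the project notebook.

```python
# Convert the reported percentages back into respondent counts.
N_RESPONDENTS = 30  # total survey responses reported in this write-up

native_language_pct = {"German": 26.67, "English": 16.67, "Turkish": 23.33, "Other": 33.33}
program_language_pct = {"English": 40.0, "German": 26.67, "Bilingual": 33.33}


def pct_to_counts(percentages, n=N_RESPONDENTS):
    """Round each percentage of n to the nearest whole respondent."""
    return {label: round(pct * n / 100) for label, pct in percentages.items()}


native_counts = pct_to_counts(native_language_pct)
program_counts = pct_to_counts(program_language_pct)

print(native_counts)   # {'German': 8, 'English': 5, 'Turkish': 7, 'Other': 10}
print(program_counts)  # {'English': 12, 'German': 8, 'Bilingual': 10}
```

Both sets of counts sum to 30, which is a quick consistency check on the reported percentages.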
Anglicism Usage Patterns
The survey results show that Anglicisms are widely used among the respondents. Most participants reported using
Anglicisms in everyday conversation, and many also reported usage in academic and professional settings.
| Usage Context | Percentage | Interpretation |
| --- | --- | --- |
| Everyday Conversation | 86.67% | Most common context for Anglicism usage. |
| Academic Settings | 53.33% | Anglicisms are common in university-related communication. |
| Professional Communication | 30.00% | Lower but still relevant workplace-related usage. |
| Friends | 86.67% | Social peer communication strongly supports Anglicism usage. |
| Home | 66.67% | Anglicisms are not limited to institutional contexts. |
Motivations for Usage
The most common motivations were ease of use, peer influence, social media, and contact between German and English.
This suggests that Anglicism usage is shaped by both practical communication needs and social/digital environments.
- Easy to use: 63.33%.
- My friends are using it: 63.33%.
- Social media: 56.67%.
- Language contact of German and English: 50.00%.
- Trending and stylish: 26.67%.
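The shares above add up to well over 100%, which suggests that respondents could select several motivations at once. A minimal sketch of how such multi-select answers can be tallied is shown below. The toy responses are hypothetical and are not the survey data, and `selection_shares` is an illustrative helper, not a function from the project.

```python
from collections import Counter

# Hypothetical toy data: each inner list is one respondent's selected motivations.
responses = [
    ["Easy to use", "Social media"],
    ["My friends are using it"],
    ["Easy to use", "My friends are using it", "Social media"],
    ["Trending and stylish"],
]


def selection_shares(responses):
    """Percentage of respondents who selected each option (multi-select)."""
    # set() guards against an option being counted twice for one respondent.
    counts = Counter(option for selected in responses for option in set(selected))
    n = len(responses)
    return {option: round(100 * count / n, 2) for option, count in counts.items()}


shares = selection_shares(responses)
print(shares)
```

Each share is computed against the number of respondents rather than the number of selections, so on the real data a value of 63.33% corresponds to 19 of 30 respondents.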
Frequently Used Anglicisms
Several Anglicisms were reported as highly integrated into respondents’ daily language. Digital and communication-related
words were especially common, which reflects the role of technology and internet culture in language contact.
| Anglicism | Frequently Used | Interpretation |
| --- | --- | --- |
| Internet | 80.00% | Most strongly integrated digital term. |
| Computer | 70.00% | Highly common technical Anglicism. |
| Okay | 56.67% | Common everyday communication marker. |
| Email | 46.67% | Frequently used in digital and academic communication. |
| Party | 33.33% | Common in social contexts. |
Statistical Methods
The project used a combination of descriptive and inferential methods. Descriptive statistics summarized the survey
responses, while a t-test, logistic regression, and a decision tree were used to compare groups and to evaluate whether
study background helps explain differences in Anglicism usage frequency.
| Method | Purpose | Interpretation |
| --- | --- | --- |
| Descriptive Statistics | Summarize age, gender, language background, and usage frequencies | Gives the basic structure of the sample |
| Density Plot | Compare age distributions by gender | Visualizes overlap and distributional differences |
| t-Test | Compare Anglicism usage between English and German majors | Tests whether group differences are statistically significant |
| Logistic Regression | Model higher vs. lower Anglicism usage | Predicts binary usage category from academic background |
| Decision Tree | Classify higher vs. lower usage patterns | Interpretable model for usage differences |
| Confusion Matrix | Evaluate classification performance | Shows correct and incorrect predictions |
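To make the t-test row more concrete, here is a minimal sketch of a two-sample comparison using only the Python standard library. The write-up does not state which t-test variant or which response variable was used, so the sketch assumes Welch's unequal-variance test on daily-use counts. The sample values are hypothetical and are not the survey data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors, based on sample variances (n - 1 denominator).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical daily-use counts, NOT the survey responses.
english_majors = [9, 12, 6, 10, 8, 11]
german_majors = [7, 9, 5, 8, 10, 6]

t_stat, df = welch_t(english_majors, german_majors)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

Turning the statistic into a p-value requires the t distribution, which the standard library does not provide. In a notebook, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns both the statistic and the p-value.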
Modeling Approach
The modeling task focused on distinguishing lower-frequency from higher-frequency Anglicism usage, so reported usage
was grouped into two classes. Bilingual students were excluded in one comparison to focus more clearly
on the difference between English-major and German-major students.
Class Definition
- Lower usage: 1 to 7 Anglicism uses per day.
- Higher usage: 8 or more Anglicism uses per day.
- Main comparison: English majors versus German majors.
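The class definition above can be sketched as a small helper. The name `usage_class` is illustrative, and the handling of zero uses is an assumption, since the definition only covers counts of 1 or more.

```python
HIGH_USAGE_THRESHOLD = 8  # "8 or more Anglicism uses per day"


def usage_class(uses_per_day):
    """Map a daily-use count to the binary class used for modeling."""
    if uses_per_day < 1:
        # Assumption: zero uses falls outside both classes as defined above.
        return None
    return "higher" if uses_per_day >= HIGH_USAGE_THRESHOLD else "lower"


print([usage_class(n) for n in (0, 1, 7, 8, 15)])
# [None, 'lower', 'lower', 'higher', 'higher']
```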
Logistic regression was used to model the probability of higher Anglicism usage, while a decision tree was used for
interpretable classification and comparison of usage groups.
Model Evaluation
The decision tree model was evaluated with a confusion matrix, precision, and recall. The model detected some structure
in the usage patterns, but performance was moderate, which is expected given the small sample size.
- Precision = 58.3%: share of predicted higher-usage cases that were actually higher usage.
- Recall = 63.6%: share of actual higher-usage cases that the model identified.
- p = 0.4475: the t-test showed no statistically significant group difference.
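Precision and recall both follow directly from the confusion-matrix counts. The sketch below shows the calculation. The full confusion matrix is not given in this write-up, so the counts used here are hypothetical: they are simply one combination that reproduces the reported percentages.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall for the positive (higher-usage) class."""
    precision = tp / (tp + fp)  # correct positives among all predicted positives
    recall = tp / (tp + fn)     # correct positives among all actual positives
    return precision, recall


# Hypothetical counts, chosen only because they match the reported figures.
precision, recall = precision_recall(tp=7, fp=5, fn=4)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
# precision = 58.3%, recall = 63.6%
```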
Confusion Matrix Summary
- German majors: 4 lower-frequency and 4 higher-frequency predictions.
- English majors: 3 lower-frequency and 9 higher-frequency predictions.
- Interpretation: English majors showed more higher-frequency usage in the model output, but statistical testing did not confirm a significant mean difference.
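The counts in this summary can also illustrate what a logistic regression with a single binary predictor estimates. The sketch below treats them as observed usage classes per major, which is an assumption, since the summary labels them as predictions. With one binary predictor the maximum-likelihood fit has a closed form: the intercept is the log odds of the reference group and the slope is the log odds ratio. This is not the project's fitted model.

```python
import math

# Counts per major as (lower, higher), read from the summary above.
# Assumption: these are treated as observed classes, not model predictions.
counts = {"German": (4, 4), "English": (3, 9)}


def log_odds(lower, higher):
    """Log odds of higher usage within one group."""
    return math.log(higher / lower)


intercept = log_odds(*counts["German"])           # German majors as reference group
slope = log_odds(*counts["English"]) - intercept  # log odds ratio, English vs. German


def p_higher(is_english_major):
    """Fitted probability of higher usage (1 = English major, 0 = German major)."""
    z = intercept + slope * is_english_major
    return 1 / (1 + math.exp(-z))


print(f"odds ratio = {math.exp(slope):.2f}")
print(f"P(higher | German) = {p_higher(0):.2f}, P(higher | English) = {p_higher(1):.2f}")
```

Under this reading the odds of higher usage are three times as large for English majors, but with only 20 students in the comparison such an estimate is very uncertain, which matches the non-significant t-test.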
Key Findings
The survey indicates that Anglicisms are widely used among university students. The strongest usage contexts are
everyday conversation and communication with friends. Ease of use, peer usage, and social media are major motivations,
and digital vocabulary such as internet, computer, and email, along with the everyday marker okay, appears highly integrated.
- Almost all respondents reported using Anglicisms.
- Anglicisms were most common in everyday and social communication.
- Social media and peer usage were major motivations.
- In the descriptive and model-based analyses, English majors appeared to use Anglicisms more frequently than German majors.
- The t-test did not find statistically significant evidence of a difference between English and German majors.
Limitations
The project is useful as an exploratory survey analysis, but the results should be interpreted carefully. The sample
size is small, and the respondents are from a limited university context. The classification model therefore provides
exploratory evidence rather than generalizable conclusions.
- Only 30 survey responses were available.
- The sample may not represent all university students.
- Self-reported language usage may contain recall bias.
- Grouping usage into lower and higher frequency simplifies the original response scale.
- The t-test and decision tree results were not fully aligned, suggesting caution in interpretation.
Future Work
A stronger follow-up study could use a larger sample, more balanced major groups, and additional variables such as
language proficiency, social media usage intensity, international background, and exposure to English-language courses.
- Collect a larger and more representative survey sample.
- Separate English majors, German majors, bilingual programs, and non-linguistic departments more clearly.
- Use ordinal regression instead of collapsing usage into binary categories.
- Include qualitative open-text responses for deeper linguistic interpretation.
- Compare self-reported usage with real communication data where ethically possible.
Outcome
This project strengthened my ability to design and analyze survey data, apply statistical methods to linguistic questions,
build interpretable models, and evaluate classification results using confusion matrices, precision, and recall.
It is a useful portfolio project because it combines research design, human-language data, survey analytics,
statistical testing, and interpretable machine learning in one workflow.
Tags: Survey Analysis, Research Methods, Linguistics, Anglicisms, t-Test, Logistic Regression, Decision Tree, Confusion Matrix, Python, Jupyter Notebook