A final project for an Introduction to Data Science course. In a group of four, we were asked to pick a topic we found interesting, find a dataset on it, and produce a final report consisting of an introduction, data description, preregistration statement, data analysis (using data science/statistics), evaluation of significance, and interpretation and conclusions.
For this project, we explored how age and wealth affect marriage rates in the UK, Italy, Germany, Austria, Switzerland, Japan, Mexico, Canada, the US, and Australia. We then expanded the project to explore how other factors, such as cohabitation and births outside of marriage, affect marriage rates in the US. We compared the results within and between the countries listed above, and we also compared marriage rates between countries with similar population sizes and between countries on different continents.
Our results supported that:
- Countries in closer geographic proximity will exhibit higher correlations between average marriage age and marriage rates.
- Countries with similar population sizes will exhibit higher correlations between average marriage age and marriage rates, although not as high as those of countries in closer proximity.
This project helped me practice skills including data cleaning using Python and SQL, data analysis and visualization, hypothesis tests, correlation matrices, and regression.
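
As an illustration of the kind of correlation comparison described above, here is a minimal sketch of a Pearson correlation matrix in plain Python. It is not the project's actual code: the function names `pearson` and `correlation_matrix`, the country labels, and the numbers in `toy_rates` are all hypothetical placeholders, and the report itself may have used a different correlation measure or library.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series_by_name):
    """Return a nested dict of pairwise Pearson correlations between named series."""
    names = list(series_by_name)
    return {
        a: {b: pearson(series_by_name[a], series_by_name[b]) for b in names}
        for a in names
    }


# Made-up placeholder values, NOT the project's data:
# one marriage-rate value per year for three hypothetical countries.
toy_rates = {
    "country_a": [7.0, 6.5, 6.1, 5.8, 5.2],
    "country_b": [6.8, 6.4, 6.0, 5.5, 5.1],
    "country_c": [4.0, 4.6, 4.1, 4.9, 4.3],
}

matrix = correlation_matrix(toy_rates)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Comparing entries of such a matrix for neighbouring countries against entries for distant ones is one way to examine the proximity pattern listed in the results.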