Big Data and Data Science, A Project on Data Analytics - A Little History on Methodologies for Data Analytics, KDD Process, CRISP-DM Methodology; Data Analytics - Types, Tools, and Applications.
Descriptive Statistics - Scale Types, Descriptive Univariate Analysis, Descriptive Bivariate Analysis; Descriptive Multivariate Analysis - Multivariate Frequencies, Multivariate Data Visualization, Multivariate Statistics, Infographics and Word Clouds; Data Quality - Missing Values, Redundant Data, Inconsistent Data, Noisy Data, Outliers.
Distance Measures - Differences between Values of Common Attribute Types, Distance Measures for Objects with Quantitative Attributes, Distance Measures for Non-conventional Attributes; Clustering Validation, Clustering Techniques - K-means, Centroids and Distance Measures, DBSCAN.
Binary Classification - Predictive Performance Measures for Classification; Distance-based Learning Algorithms - K-nearest Neighbor Algorithms, Case-based Reasoning; Probabilistic Classification Algorithms - Logistic Regression Algorithm, Naive Bayes Algorithm.
Regression and its Types; DA Applications for Text, Web, and Social Media - Working with Texts, Recommender Systems, Social Network Analysis.
Reference Books:
1. Dean J, "Big Data, Data Mining and Machine Learning", Wiley Publications, 2014.
2. Provost F and Fawcett T, "Data Science for Business", O'Reilly Media Inc, 2013.
3. Janert PK, "Data Analysis with Open Source Tools", O'Reilly Media Inc, 2011.
4. Weiss SM, Indurkhya N and Zhang T, "Fundamentals of Predictive Text Mining", Springer-Verlag London Limited, 2010.
5. Marz N and Warren J, "Big Data", Manning Publications, 2015.
6. Runkler TA, "Data Analytics: Models and Algorithms for Intelligent Data Analysis", Springer, 2012.
Text Book:
1. João Moreira, Andre Carvalho and Tomás Horvath, "A General Introduction to Data Analytics", Wiley, 2018.