Statistics and Data Science

Parent Programme

Bachelor in Computing (Level 7 NFQ)

NFQ Level & Reference

Level 7 / Ref: M3.2

Duration

12 Weeks X 3 Hours per week

MODULE TITLE

Statistics and Data Science

STAGE

Award

Module Credit Units

ECTS: 5

Statistics and Data Science Module

Introduction to Statistics and Data Science

The aim of this Statistics and Data Science module is to equip learners with the tools and methods to find structure in data, gain deeper insight from it, and analyse and quantify uncertainty. Learners will understand how data science can help organisations reduce costs, make more informed decisions, and develop new products and services.

Indicative Syllabus Content

Introduction

Study types: observational studies and experiments; causation
Gathering data: primary & secondary sources; questionnaires; experimental design
Population and sample; parameters and statistics; sampling methods
Variables and observations
Bias
Descriptive vs. inferential statistics

Exploratory data analysis (EDA):

Types of variables: categorical, numerical
Measurement scales: nominal, ordinal, discrete, continuous
Types of graphs: bar charts, histograms, boxplots, time series plots, scatterplots, etc.
Summarising distributions: measures of centre (mean, median and mode); measures of spread (range, interquartile range, standard deviation); measures of skewness
Descriptions of a distribution: shape; modality; outliers
Contextualising EDA using industry software: examples using Python
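
As an illustration of the summary measures listed above, the following is a minimal sketch using only Python's standard library. The function name `summarise` and the `tickets` sample are hypothetical and chosen purely for illustration; a taught example may well use other libraries.

```python
import statistics

def summarise(values):
    """Return basic measures of centre and spread for a numeric sample."""
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Hypothetical sample: daily support tickets over nine days (one outlier)
tickets = [2, 4, 4, 5, 7, 8, 9, 11, 40]
summary = summarise(tickets)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up sample the single large value pulls the mean above the median, which is the kind of skew and outlier behaviour EDA is meant to surface.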

Probability

Random experiments; outcomes; sample spaces; (simple & compound) events
Classical, relative frequency, and subjective probability approaches
Calculating simple probabilities; marginal and conditional probabilities
Mutually exclusive events; dependent and independent events; multiplication rule; addition rule; Bayes’ theorem
Central limit theorem
Contextualising probability using industry software: examples using Python
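
Bayes' theorem from the list above can be sketched in a few lines of Python. The function `bayes_posterior`, its parameter names, and the screening-test figures are assumptions made for illustration only.

```python
def bayes_posterior(prior, sensitivity, false_positive_rate):
    """Return P(A | B), where A is the condition and B is a positive test."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    # Bayes' theorem: P(A|B) = P(B|A)P(A) / P(B)
    return sensitivity * prior / p_positive

# Hypothetical figures: 1% prevalence, 95% sensitivity, 5% false positives
posterior = bayes_posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {posterior:.3f}")
```

With these illustrative numbers the posterior is roughly 0.16, showing how a low prior keeps the conditional probability small even for an accurate test.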

Discrete Distributions

Probability mass functions (PMFs)
Bernoulli random variables
Binomial distribution
Poisson distribution
Contextualising distributions using industry software: examples using Python
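
The binomial and Poisson probability mass functions can be written directly from their formulas. This is a sketch using the standard library only; the function names and example values are hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): exp(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Probability of exactly 2 heads in 4 fair coin tosses
p_two_heads = binomial_pmf(2, 4, 0.5)

# Probability of no events when the average rate is 2 per interval
p_no_events = poisson_pmf(0, 2.0)

# A PMF must sum to 1 over all possible outcomes
total = sum(binomial_pmf(k, 4, 0.5) for k in range(5))
print(p_two_heads, p_no_events, total)
```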

Continuous Distributions

Cumulative distribution functions (CDFs) and probability density functions (PDFs)
Exponential distribution
Normal distribution: properties, z-scores, normal tables
Contextualising distributions using industry software: examples using Python
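
A z-score and the corresponding normal-table lookup can be reproduced with `statistics.NormalDist` from the standard library. The helper `z_score` and the example figures (mean 100, standard deviation 15) are assumptions for illustration.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Hypothetical measurement scale with mean 100 and standard deviation 15
z = z_score(130, mean=100, sd=15)

# Equivalent of reading the area to the left of z from a normal table
p_below = standard_normal.cdf(z)
print(f"z = {z:.2f}, P(Z < z) = {p_below:.4f}")
```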

Statistical Inference

Point estimates; standard error
Confidence intervals
Testing theories; introduction to hypothesis testing
Testing a mean or proportion (one-sample tests)
Testing the difference between two means or two proportions (two-sample tests)
Chi-squared tests
Sample size and power
Contextualising hypothesis tests using industry software: examples using Python
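
As one possible illustration of a one-sample test and a confidence interval, the sketch below uses the large-sample normal approximation for a proportion. The function names and the survey figures (60 successes out of 100) are hypothetical, and the approximation is only reasonable when the sample is large enough.

```python
import math
from statistics import NormalDist

def one_sample_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 (normal approximation)."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def proportion_confidence_interval(successes, n, level=0.95):
    """Approximate (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 60 of 100 respondents prefer the new product
z, p_value = one_sample_proportion_z_test(60, 100, p0=0.5)
low, high = proportion_confidence_interval(60, 100)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```

At a 5% significance level this made-up sample would lead to rejecting the null hypothesis, which agrees with the 95% interval excluding 0.5.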

Making predictions

Correlation
Regression line; regression analysis; least squares fit; producing predictions
Multiple regression and non-linear regression
Model validation; outliers; influential observations
Contextualising predictions using industry software: examples using Python
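
A least squares regression line and a prediction from it can be sketched without any external libraries. The names `least_squares_fit` and `predict`, and the study-hours data, are hypothetical and used for illustration only.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the simple least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations in x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted response for a given x on the fitted line."""
    return intercept + slope * x

# Hypothetical data: hours studied and test score
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = least_squares_fit(hours, scores)
predicted = predict(6, slope, intercept)
print(f"score = {intercept:.1f} + {slope:.1f} * hours; prediction at 6 hours: {predicted:.1f}")
```

Note that predicting at 6 hours extrapolates beyond the observed range, so such a prediction should be treated with caution.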

Data science in context

Report writing
Statistics in business: legal issues, ethics, GDPR, privacy
Introduction to advanced topics, such as: data warehouses, machine learning, data mining, big data

Minimum Intended Learning Outcomes (MIMLOs)

Upon successful completion of this module, the learner should be able to:

MIMLO1

Examine data of various types with consideration of data gathering and transformation.

MIMLO2

Analyse data using statistical and probability techniques.

MIMLO3

Test hypotheses regarding means, proportions, and differences between two means or proportions using appropriate statistical tests.

MIMLO4

Outline data in the form of data visualisations and reports.

MIMLO5

Differentiate between ethical considerations, legal requirements and evidence-based reasoning when making decisions.

Assessment

MIMLOs

Assessment

Percentage

1, 2, 3, 4, 5

CA 1, CA 2 - In-Class Written Assessments

Total 100%

CA 3 - Examination

All Assessments

Reassessment Opportunity

Where the combined marks of the assessments and examination do not reach the pass mark, the learner will be required to repeat the element of assessment that they failed. Reassessment materials will be published on Moodle after the Examination Board Meeting and will be aligned to the MIMLOs. Reassessment marks will be capped at 40% unless there are personal mitigating circumstances.

Aims & Objectives

This Statistics and Data Science module will ensure learners meet the following objectives:

Develop an understanding of data collection and the types of studies.
Assemble data through data cleaning and transformation.
Undertake preliminary data analysis using graphs and descriptive and inferential statistics.
Identify and develop the model that best fits the problem requirement.
