You have a large amount of complex data at your disposal. Naturally, you want to extract important information from it to understand the phenomena the data describe. You expect this information to help improve key processes within your company.

During the training you will learn how to prepare data for analysis and then use appropriate methods to extract information useful for your business. We will use R for this purpose.

The R statistical system is a powerful and free tool. It is applied in virtually all fields of business and science. If you haven’t heard of R, learn why it pays to know it.


What will you learn?

  • Understand the methodology of data analysis using data mining methods.
  • Get useful practical skills.
  • Learn how to prepare your data for further analysis.
  • Learn methods for data exploration and for building predictive models: classification methods (used to develop predictive models), cluster analysis methods (discovering segments of similar clients), dimensionality reduction methods (graphical presentation of multidimensional data), and methods for discovering association rules (market basket analysis).
  • Learn how to assess the predictive power of the models you build.
  • Work on these topics hands-on at a computer; this takes up half of the training.
  • Receive comprehensive materials and R scripts that let you work independently on your own data.


Who is this training for?

Employees of departments working on data analysis and modelling (e.g. CRM, credit risk), controlling, audit, marketing, IT, and other departments who:

  • need to analyse data,
  • want to discover useful knowledge from data,
  • build predictive models.


Shortened agenda

  • Introduction to data mining and its tasks
  • Data preparation for data mining
  • Overview of data mining methods
  • Selected data visualization methods and Exploratory Data Analysis
  • Optimal selection of features and models / algorithms
  • Executing data mining projects
  • Case studies and discussions


Full agenda

  1. Introduction to data mining
    • what is data mining (DM)?
    • overview of DM applications: banking and industry, text mining, web mining
    • data mining for business needs: analysis of customers and their behaviour, CRM, decision support, possible benefits
    • main stages of Knowledge Discovery in Databases (KDD)
    • differences between DM, OLAP, and analysis of database query results
    • DM and statistics
  2. Overview of DM tasks
    • classification
    • cluster analysis
    • discovery of association rules
    • dimensionality reduction
  3. Data preprocessing for further DM analysis
    • data quality analysis
    • missing data and outliers
    • data cleaning
    • preliminary transformations
  4. Overview of DM methods
    • classification of methods / algorithms; supervised learning and unsupervised learning
    • classification methods (classification trees, k-nearest neighbours (k-NN))
    • cluster analysis algorithms (k-means, hierarchical methods)
    • dimensionality reduction techniques (Principal Component Analysis (PCA), Multidimensional Scaling (MDS))
    • algorithms for discovering association rules (Apriori algorithm)
  5. Selected data visualization methods and Exploratory Data Analysis (EDA)
    • one- and multidimensional graphical data analysis
    • basic tools of descriptive statistics
    • advanced algorithms used in visualization of multivariate data
    • what is EDA?
    • methods applied in EDA
  6. Optimal selection of features and models / algorithms
    • selection of important features
    • general rules of model selection
    • overfitting problem
    • trade-off between model complexity and effectiveness
    • assessment of effectiveness of applied methods / algorithms
  7. Executing projects in DM
    • popular methodologies for managing DM projects: Virtuous Cycle of Data Mining, CRISP-DM
    • practical hints
  8. Case studies
    • examples of typical DM tasks illustrated with real-world applications
  9. Discussion: possible business benefits from using data mining methods
    • a discussion with attendees about applying data mining methods to their business needs
