Scoring and predictive models

Learn how scoring and predictive models can help your business perform better.


Many companies apply statistical models to optimize their activities.

Excellent examples of these models are scoring systems applied in the process of creditworthiness assessment, models used to optimize bad debt collection, models optimizing direct marketing, and models used in CRM (Customer Relationship Management). These models are called predictive models because they allow us to predict the future behaviour of clients. A scoring model is a special kind of predictive model.

Predictive models can predict the chance that an arbitrary event will occur, or whether it will occur at all: defaulting on loan payments, having an accident, client churn or attrition, or buying a product.

Compared with applying common-sense rules or rules prepared by an expert, decision support based on predictive models yields a profit that is 10-30% higher.


Predictive models can estimate the chance that an arbitrary event will occur, such as default on loan payments, an accident, client churn or attrition, or the purchase of a product. Building these models requires suitable data.

These models can be built and applied when we have two groups of objects (e.g. clients) that we want to distinguish based on their features (characteristics). What are these groups? That depends on the task and the kind of business. A classic example is distinguishing borrowers who will repay a loan from those who will not, based on data from their credit applications. This is an example of application scoring.

Other examples of scoring models are given below:

  • Credit risk:
    • Forecasting credit risk before granting a loan (application scoring)
    • Forecasting risk for a loan already granted (behavioral scoring)
  • Detection of fraud / atypical transactions (fraud detection)
  • Forecasting the response to a mailing campaign (response scoring)
  • Selection of optimal bad debt collection activities
  • Will a client use a purchased product? (activation scoring)
  • To what extent will a client use a purchased product? (usage scoring)
  • Will a client buy a product offered to them while buying another product? (cross-selling)
  • Will a client buy more of a product than previously requested (e.g. decide on a higher credit limit)? (up-selling)
  • Using a product less (attrition scoring)
  • Ceasing to use a product while starting to use another one, a problem that often occurs in telecoms (churn)


Building a correct predictive or scoring model requires expert knowledge and extensive experience. For many companies, preparing such a model is a one-off task. Not every company builds so many models that it needs to hire a team of qualified statisticians. That is why building predictive models is often outsourced to specialized external companies with expertise in this area.

Companies competing with QuantUp in building predictive models and analysing data usually apply only simple, standard statistical methods (so-called best practices). These methods do not make optimal use of the information hidden in the data. In some cases, these standard methods do not allow a robust model to be built at all.

Applying advanced methods of modern statistics (the bootstrap, methods of computational statistics, data exploration methods) when building these models results in better and more effective models.
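
As an illustration of the bootstrap mentioned here, the following minimal sketch estimates a percentile confidence interval for the Gini coefficient of a score. It is only a sketch under stated assumptions, not a description of any particular production method: the data are invented, a label of 1 is assumed to mean a "bad" client and 0 a "good" one, higher scores are assumed to mean lower risk, and the function names (auc, gini, bootstrap_gini_interval) are hypothetical.

```python
import random


def auc(scores, labels):
    """Share of (good, bad) pairs in which the good client has the higher score.

    Ties count as half. Assumes label 0 = good, label 1 = bad.
    """
    goods = [s for s, y in zip(scores, labels) if y == 0]
    bads = [s for s, y in zip(scores, labels) if y == 1]
    wins = 0.0
    for g in goods:
        for b in bads:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(goods) * len(bads))


def gini(scores, labels):
    """Gini coefficient expressed through the AUC: Gini = 2 * AUC - 1."""
    return 2.0 * auc(scores, labels) - 1.0


def bootstrap_gini_interval(scores, labels, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the Gini coefficient."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        # Resample clients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class (Gini is undefined there).
        if 0 < sum(ys) < n:
            stats.append(gini([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented toy sample: 12 clients, 5 of them bad.
scores = [720, 680, 650, 640, 610, 590, 560, 540, 500, 470, 455, 430]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1]

point_estimate = gini(scores, labels)
lower, upper = bootstrap_gini_interval(scores, labels)
print(point_estimate, lower, upper)
```

On a sample this small the interval is wide, which is the point of the exercise: the bootstrap makes the uncertainty of the quality measure visible instead of reporting a single number.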

This is crucial when we have small data sets. That happens, for example, when building models that assess creditworthiness for people applying for a mortgage loan: the sample is much smaller than for cash or retail loans. The less data we have, the more important the methods chosen to build the model become. When the data are extraordinarily big, for example when several hundred characteristics are available, the optimal choice of methods and experience in data analysis play a key role.

We know from experience that methods well matched to the data and circumstances allow more powerful models to be built. Well-suited methods also allow uncertainty to be assessed, which reduces the risk that always accompanies the implementation of a model. Using better models directly increases your profit and competitiveness. The latter is particularly important in periods of economic recession.

We have particularly extensive experience in building scoring models, developing software for building them, and teaching how to build them. Read more about our experience.


The process of building a predictive model is complex and multi-stage:

  • defining parameters:
    • agreeing upon the definition of a good and a bad client, or the definition of the groups of clients the model is to distinguish between,
    • defining the application window and the performance window,
    • defining exclusions,
  • data pre-processing:
    • preliminary choice of characteristics for building the model,
    • gathering and cleaning data (including missing data imputation and outlier detection and handling),
    • analysis, coarse classing, and fine classing of characteristics,
    • building derived characteristics,
    • additional transformations of characteristics (optional),
  • selection of the best features for building the model (using statistical methods and based on business logic, knowledge, and understanding),
  • building the model and comprehensively assessing its predictive power (using statistical tests, the bootstrap, and similar methods),
  • analysis of potential segmentations: is a single model enough, or is it better to split the data into segments and build a model for each segment (optional),
  • interaction analysis (optional),
  • analysis of the quality of data on rejected applications and taking them into account: the reject inference procedure (optional),
  • preparation of reports summarizing the model-building process.
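
To make the classing and feature-selection steps above more concrete, here is a minimal sketch of two quantities often computed for a classed characteristic: the weight of evidence (WoE) of each class and the information value (IV) of the whole characteristic. The text does not state which measures are actually used, so treat this as one common choice, not the method. The counts are invented, the function name woe_table is hypothetical, and the sketch assumes every class contains at least one good and one bad client.

```python
import math


def woe_table(bins):
    """Compute WoE per class and the IV of one classed characteristic.

    bins maps a class name to a (number of goods, number of bads) pair.
    Assumes no class has a zero count of goods or bads.
    """
    total_good = sum(good for good, bad in bins.values())
    total_bad = sum(bad for good, bad in bins.values())
    woe = {}
    iv = 0.0
    for name, (good, bad) in bins.items():
        share_good = good / total_good  # share of all goods falling in this class
        share_bad = bad / total_bad     # share of all bads falling in this class
        woe[name] = math.log(share_good / share_bad)
        iv += (share_good - share_bad) * woe[name]
    return woe, iv


# Invented counts for a hypothetical "age" characteristic after classing.
age_bins = {
    "<25": (100, 50),
    "25-40": (400, 100),
    ">40": (500, 50),
}

woe, iv = woe_table(age_bins)
for name, value in woe.items():
    print(name, round(value, 3))
print("IV:", round(iv, 3))
```

A positive WoE marks a class with relatively more good clients than the sample as a whole, a negative one marks a riskier class, and the IV summarizes how strongly the characteristic separates the two groups.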

We describe this process in more detail in the program of our course on building scoring models. There you can read about the methods we use to build scoring models.

Validation of the model is very important (it includes assessing the quality and adequacy of the model):

  • while building the model,
  • just after completing the model, before implementing it,
  • while the model is in use.

You can read in detail about the methods we use in validation in the program of our course on validation of scoring and rating models.

The steps taken, the statistical methods selected, and the complexity of the validation process depend on the specifics of the credit portfolio and the credit process, the size of the development sample, and the kind of scoring (application / behavioural). It may be necessary to apply reject inference methods and sophisticated methods for small samples.


A model is delivered as a score table, as rules (if it is a classification tree), or as a PMML file, depending on the kind of model and on needs.
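
As a rough illustration of what a score table means in practice, the sketch below applies a tiny table to a single applicant: each characteristic contributes the points assigned to the class the applicant falls into. Everything here is hypothetical (the characteristics, class boundaries, point values, base points, and the function name score_applicant); a delivered table would come from the modelling process described above.

```python
# Hypothetical score table. Numeric characteristics use
# (exclusive upper bound, points) pairs in ascending order;
# categorical characteristics map each category to points.
SCORE_TABLE = {
    "age": [(25, 10), (40, 25), (float("inf"), 40)],
    "residential_status": {"owner": 30, "tenant": 15, "other": 5},
}
BASE_POINTS = 200  # hypothetical starting score


def score_applicant(applicant):
    """Sum the base points and the points of each characteristic's class."""
    total = BASE_POINTS
    for name, spec in SCORE_TABLE.items():
        value = applicant[name]
        if isinstance(spec, dict):
            # Categorical characteristic: direct lookup.
            total += spec[value]
        else:
            # Numeric characteristic: first class whose upper bound exceeds the value.
            for upper, points in spec:
                if value < upper:
                    total += points
                    break
    return total


applicant = {"age": 30, "residential_status": "owner"}
print(score_applicant(applicant))
```

A table of this shape is easy to audit and to implement in almost any decision system, which is one reason it is a common delivery format.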

We deliver documentation containing:

  • a description of the model-building process,
  • the assumptions made,
  • justification and explanation of the methodological and analytical decisions taken in the modelling process,
  • brief information about the analysis and modelling methods used,
  • description and characterization of the variables preliminarily chosen for further modelling (charts and tables),
  • description and characterization of the variables included in the model (charts and tables),
  • model quality assessment: all commonly applied criteria (in charts and tables),
  • reports summarizing the building of the model:
    • gains table (cut-off analysis),
    • characteristic reports.
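
For readers unfamiliar with a gains table, the following minimal sketch shows the kind of cut-off analysis it supports: for each candidate cut-off it reports how many applicants would be accepted, the bad rate among them, and the share of bad clients that would be rejected. The data, the column names, and the function name gains_table are invented for illustration; the sketch assumes a label of 1 means a bad client and that applicants scoring at or above the cut-off are accepted.

```python
def gains_table(scores, labels, cutoffs):
    """Build one row of summary statistics per candidate cut-off."""
    n = len(scores)
    total_bad = sum(labels)
    rows = []
    for cutoff in cutoffs:
        # Labels of the applicants who would be accepted at this cut-off.
        accepted = [y for s, y in zip(scores, labels) if s >= cutoff]
        n_accepted = len(accepted)
        bad_accepted = sum(accepted)
        rows.append({
            "cutoff": cutoff,
            "acceptance_rate": n_accepted / n,
            "bad_rate_accepted": bad_accepted / n_accepted if n_accepted else 0.0,
            "bads_rejected_share": (
                (total_bad - bad_accepted) / total_bad if total_bad else 0.0
            ),
        })
    return rows


# Invented toy sample: 10 applicants, 4 of them bad.
scores = [720, 680, 650, 640, 610, 590, 560, 540, 500, 470]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

for row in gains_table(scores, labels, cutoffs=[500, 600, 700]):
    print(row)
```

Reading the rows side by side shows the trade-off behind any cut-off decision: a higher cut-off lowers the bad rate among accepted clients but also lowers the acceptance rate.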

If the agreement includes a monitoring service, we periodically deliver:

  • reports (listed below), for example as MS Excel files containing charts,
  • a short expert analysis of the results and recommendations (e.g. removing characteristics from the model, calibrating the model, rebuilding the model).

Reporting for monitoring purposes is consistent with best practices. It contains reports from the following groups:

  • acceptability and risk (if needed),
  • predictive power of scoring system,
  • predictive power of characteristics,
  • population stability.
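
One measure commonly used in population stability reports is the population stability index (PSI), which compares the distribution of scores (or of a characteristic) in the development sample with the distribution in a recent period. The text does not say which measures the reports contain, so the sketch below is an assumption about a typical choice. The counts and the function name population_stability_index are invented, and the sketch assumes every score band is non-empty in both samples.

```python
import math


def population_stability_index(expected_counts, actual_counts):
    """PSI = sum over bands of (actual share - expected share) * ln(actual / expected).

    Both lists hold counts per score band, in the same band order.
    Assumes no band has a zero count in either sample.
    """
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        expected_share = expected / expected_total
        actual_share = actual / actual_total
        psi += (actual_share - expected_share) * math.log(actual_share / expected_share)
    return psi


# Invented counts per score band (lowest to highest scores).
development_sample = [100, 200, 400, 200, 100]
recent_applications = [120, 220, 380, 190, 90]

print(round(population_stability_index(development_sample, recent_applications), 4))
```

The index is zero when the two distributions are identical and grows as the recent population drifts away from the one the model was built on, which is what a monitoring report is meant to flag.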

Details, the quality measures and criteria applied, the frequency of reporting, and the technical solutions used for monitoring are to be agreed depending on needs and requirements.


What methods do we apply? The ones we teach, and many more, depending on needs. Look through the programs of our trainings to familiarize yourself with the methods we use.
