The Data Analysis Checklist that Every Beginner Needs

Sep 30, 2022 | Blogs

Before we dive into the specifics of data analytics, let’s revisit what it means. Data analytics involves collecting and cleaning raw data to extract information and generate insights that help companies make informed, data-driven decisions.

Now that you have the idea, do you know where to start? How exactly do you begin analysing data? If you have no clue, fret not. Read on as we cover the data analysis checklist that every data practitioner, beginner or professional, needs.

What is this checklist?

As the saying often attributed to Benjamin Franklin goes, “If you fail to plan, you plan to fail.” Planning is essential when embarking on any project, and a data analytics project is no exception.

The checklist we are introducing is a widely adopted project framework for data analytics known as the CRoss-Industry Standard Process for Data Mining (CRISP-DM).

Introduction to CRISP-DM

The CRISP-DM framework is widely adopted because it brings a structured approach to project planning. Data practitioners follow a systematic process that results in a feasible analytics solution that meets the company’s needs. The solution is the product of a project that is clearly understood, planned, executed and documented.

CRISP-DM contains six phases, which we cover next.

Phase 1: Business Understanding

Before rushing into analysing the data, take a step back and come up with a list of relevant questions first.

  • What is the business problem this project is trying to solve?
  • Are there sufficient resources (data quality, manpower, etc.) to address the business problem at hand?
  • What are some risks and constraints of the project?
  • What are the business and data mining objectives?
  • Who are the relevant stakeholders, and how does the project help them?

Phase 2: Data Understanding

This phase revolves around collecting and exploring data and verifying its quality. Data quality is crucial, especially when choosing your data sources, as it directly affects the analytics solution.

Data analysis provides powerful analytical tools that apply across many industries, but it has its fair share of limitations. One limitation is that the quality of data mining results depends on the availability and quality of the data. These are some questions to consider:

  • Is the dataset large enough to reveal patterns and trends?
  • Are the data types correct? Do the values make logical sense, and are they scaled correctly?
  • Is any data missing from the dataset?
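
The questions above can be turned into a quick profiling pass over the data. The following is a minimal sketch in plain Python, not a prescribed CRISP-DM step: the sample records, the field names and the `profile` helper are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000.0, "city": "Singapore"},
    {"age": 29, "income": None, "city": "Singapore"},
    {"age": "41", "income": 61000.0, "city": None},  # age stored as text
    {"age": 52, "income": 58000.0, "city": "Johor Bahru"},
]


def profile(rows):
    """Summarise row count, observed types and missing values per field."""
    fields = sorted({key for row in rows for key in row})
    report = {"row_count": len(rows), "fields": {}}
    for field in fields:
        values = [row.get(field) for row in rows]
        present = [v for v in values if v is not None]
        report["fields"][field] = {
            # How many records lack a value for this field
            "missing": len(values) - len(present),
            # Which Python types appear, to spot inconsistent data types
            "types": dict(Counter(type(v).__name__ for v in present)),
        }
    return report


report = profile(rows)
print("rows:", report["row_count"])
for name, stats in report["fields"].items():
    print(name, stats)
```

A field that reports more than one type, such as `age` here, or a high missing count is a signal to revisit the data source before moving on.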

Phase 3: Data Preparation

During this phase, the team prepares the data for analysis by selecting and cleaning it.

  • Which fields are relevant to data modelling, and which are not?
  • How do we fix missing and erroneous data? Do we remove the affected records or replace the values with a descriptive statistic such as the mean, median or mode?
  • Are there any outliers that need to be removed?
  • Can we derive any new fields, and do any fields need to be encoded?
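
As a rough illustration of the preparation questions above, the sketch below fills a missing value with the median, drops outliers, and encodes a categorical field. The data is made up, and the 1.5 × IQR rule and one-hot encoding are common conventions chosen here as assumptions, not requirements of CRISP-DM.

```python
from statistics import median, quantiles

# Hypothetical income column; None marks a missing value.
incomes = [52000.0, None, 61000.0, 58000.0, 55000.0, 950000.0, 49000.0]

# 1. Replace missing values with the median of the observed values.
observed = [v for v in incomes if v is not None]
fill = median(observed)
imputed = [fill if v is None else v for v in incomes]

# 2. Drop outliers using the 1.5 * IQR rule (an assumed convention).
q1, _, q3 = quantiles(imputed, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [v for v in imputed if low <= v <= high]

# 3. One-hot encode a hypothetical categorical field.
cities = ["Singapore", "Johor Bahru", "Singapore"]
categories = sorted(set(cities))
encoded = [[1 if city == cat else 0 for cat in categories] for city in cities]

print("median used for imputation:", fill)
print("values kept after outlier removal:", cleaned)
print("categories:", categories)
print("encoded:", encoded)
```

Whether to remove an outlier or keep it depends on the business problem: an extreme value may be a data entry error, or it may be exactly the case the project cares about.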

Phase 4: Modelling

A modelling technique is chosen based on the nature of the business problem. If the target variable is quantitative (a number), regression techniques such as linear regression are used. If the target variable is qualitative (a category), classification techniques such as logistic regression are used.

When there is no target variable and the aim is to discover structure in the data, techniques like association and clustering are used. Additionally, decision trees can handle both quantitative and qualitative targets.

  • Which is the most appropriate machine learning algorithm we should use?
  • Which model has the highest accuracy?
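
One simple way to approach the accuracy question is to score every candidate model on the same held-out records and pick the best one. The sketch below assumes the predictions have already been produced; the model names, labels and predictions are hypothetical.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the actual labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical held-out labels and predictions from two candidate models.
actual = ["yes", "no", "yes", "yes", "no"]
candidates = {
    "model_a": ["yes", "no", "no", "yes", "no"],
    "model_b": ["yes", "yes", "yes", "no", "no"],
}

# Score each candidate on the same records and keep the best scorer.
scores = {name: accuracy(preds, actual) for name, preds in candidates.items()}
champion = max(scores, key=scores.get)

print(scores)
print("champion:", champion)
```

Accuracy suits categorical targets with reasonably balanced classes. For a quantitative target, or when one class is rare, a different metric is usually a better basis for comparison.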

Phase 5: Evaluation

In this phase, the results from the model are compared against the data mining objectives set in Phase 1. The project team can review the entire process collectively, looking into areas of improvement for the champion model (the best-performing candidate) and checking whether any tasks were overlooked. The project manager has the final say on whether the project can proceed to the next phase, deployment.

  • Are there any further improvements that can be made to the existing model?
  • Were any of the important tasks or issues overlooked?
  • Are there any other outstanding issues that need to be addressed before the model is ready for deployment?

Phase 6: Deployment

The analytics solution is then officially deployed. After deployment, the model must be closely monitored and maintained. A final report documenting every finding is written to conclude the project.

How to get started on your own project?

Now that you have grasped the basic concepts of data analytics, a data analysis checklist and the CRISP-DM framework, why not take your knowledge one step further by signing up for Vertical Institute’s beginner-friendly Data Analytics Course led by industry experts? 

The Data Analytics Course covers the fundamentals of data analytics, and you will become proficient in industry-relevant tools such as Excel, SQL and Tableau. You also get to work on a capstone project that tackles real-world problems and gain a professional certification upon completion.

Kickstart your data analytics journey today!

About Vertical Institute

Vertical Institute prepares individuals for the jobs of tomorrow. We specialise in teaching in-demand skills, building the next generation of changemakers and inventors through our world-class tech courses and certifications. 

Singaporeans and PRs can receive up to 90% IBF Funding off their course fees with Vertical Institute. The remaining fees can be claimed with SkillsFuture Credits or NTUC UTAP Funding.
