
Friday 24 March 2017

Analytics on India census using Spark

POC#: Analytics on India census using Spark


In this article, I explore census data for India to understand the country's demographics: population growth, religious distribution, gender distribution, sex ratio, and so on. Even with a small dataset, I could gain many valuable insights about the country. I used Spark SQL and the built-in charts provided by Databricks.

India is the second most populous country in the world, with over 1.271 billion people, more than a sixth of the world's population. It already accounts for 17.5% of the world's population and is projected to become the world's most populous country by 2025, surpassing China, with its population reaching 1.6 billion by 2050. Its population growth rate is 1.2%.

We have loaded the census data into tables. Each heading below is the title of one chart built from those tables.
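The post does not show the loading step itself. As a rough, self-contained sketch of the idea, the snippet below reads CSV text into a SQL table and runs a districts-per-state query. It uses Python's built-in sqlite3 module as a local stand-in for Spark tables; the table name `census`, the column names, and the sample rows are all hypothetical placeholders, not the real census file. On Databricks, the equivalent would typically be `spark.read.csv(path, header=True, inferSchema=True)` followed by `createOrReplaceTempView`.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content: column names and values are placeholders,
# not the real census file.
CSV_TEXT = """state,district,population,literate
State A,District 1,1000,700
State A,District 2,2000,900
State B,District 3,1500,600
"""


def load_census(conn, csv_file):
    """Create a `census` table and fill it from a CSV file-like object."""
    conn.execute(
        "CREATE TABLE census ("
        "state TEXT, district TEXT, population INTEGER, literate INTEGER)"
    )
    reader = csv.DictReader(csv_file)
    rows = [
        (r["state"], r["district"], int(r["population"]), int(r["literate"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO census VALUES (?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_census(conn, io.StringIO(CSV_TEXT))

# Number of districts per state, the kind of query behind the first chart.
districts_per_state = conn.execute(
    "SELECT state, COUNT(DISTINCT district) FROM census "
    "GROUP BY state ORDER BY state"
).fetchall()
print(loaded, districts_per_state)
```

Once the data sits in a table, every chart in this post reduces to a `GROUP BY state` query of this shape.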

India’s States with Number of Districts.


India’s Population Density in terms of Districts.

Scheduled Castes (SC) Population per State.


Literacy Rate per State in India

States having Literacy Rate less than 50%
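The query behind a chart like this is not shown in the post. A minimal sketch, assuming literacy rate is computed as literate persons divided by total population, is given below. It again uses sqlite3 as a local stand-in for Spark SQL, and the table, columns, and rows are invented for illustration only. The `SELECT` statement itself is plain SQL and should carry over to Spark SQL largely unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE census ("
    "state TEXT, district TEXT, population INTEGER, literate INTEGER)"
)
# Made-up district rows, used only to exercise the query.
conn.executemany(
    "INSERT INTO census VALUES (?, ?, ?, ?)",
    [
        ("State A", "District 1", 1000, 700),
        ("State A", "District 2", 2000, 900),
        ("State B", "District 3", 1500, 600),
        ("State B", "District 4", 500, 300),
    ],
)

# Aggregate districts up to state level, then keep states below 50%.
# The expression is repeated in HAVING instead of using the alias,
# which keeps the query portable across SQL engines.
QUERY = """
SELECT state,
       100.0 * SUM(literate) / SUM(population) AS literacy_rate
FROM census
GROUP BY state
HAVING 100.0 * SUM(literate) / SUM(population) < 50
ORDER BY state
"""
low_literacy = conn.execute(QUERY).fetchall()
print(low_literacy)
```

Summing before dividing weights each district by its population, which is usually what a state-level rate should mean; averaging district rates directly would give small districts too much influence.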

Gender-wise Literacy Rate per State


Literacy Rate by Education Type per State

Gender Ratio per State
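In Indian census reporting, the sex ratio is expressed as the number of females per 1,000 males. A minimal sketch of that calculation per state follows, with sqlite3 standing in for Spark SQL; the table name `census_gender`, the column names `male` and `female`, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE census_gender ("
    "state TEXT, district TEXT, male INTEGER, female INTEGER)"
)
# Made-up district rows, used only to exercise the query.
conn.executemany(
    "INSERT INTO census_gender VALUES (?, ?, ?, ?)",
    [
        ("State A", "District 1", 1000, 940),
        ("State A", "District 2", 1000, 960),
        ("State B", "District 3", 500, 520),
    ],
)

# Sex ratio = females per 1,000 males, aggregated to state level.
QUERY = """
SELECT state,
       1000.0 * SUM(female) / SUM(male) AS sex_ratio
FROM census_gender
GROUP BY state
ORDER BY state
"""
sex_ratio = conn.execute(QUERY).fetchall()
print(sex_ratio)
```

A value below 1,000 means the state has fewer females than males; a value above 1,000 means the opposite.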

Population by Religion per State

Drinking Water Facility per State

Status of Electricity Facility per State

Education Facility per State

Medical Facility per State

Bus Transportation per State

Road Status per State

Residence Status in India by State







8 comments:

  1. This is a brilliant piece of analysis and can be applied to multiple programmes run by the Indian government.

  2. Hi Bhavesh,
    Where is the input dataset used for this analysis?

    Please post the Spark SQL code you used to produce these visualizations.

    Thanks.

  3. A lot of nice graphs. Is the data set available somewhere for download?
    I am asking because our product Querona allows building a logical data warehouse and it emulates SQL Server protocol, translating Transact-SQL to Spark SQL. I would like to migrate your charts and show them from Power BI.

  4. Hi Bhavesh,

    This is outstanding work. If possible, could you please upload the dataset and the code to Git so that others can learn from it? I would really appreciate it.
