Analytics on India census using Spark


In this article, I explore census data for India to understand the country's demographics: population growth, religion distribution, gender distribution, sex ratio and more. Even with a small dataset, I could gain a lot of valuable insight about the country. I used Spark SQL and the built-in charts provided by Databricks.

India is the second most populous country in the world, with over 1.271 billion people, about 17.5% of the world's population (more than a sixth). It is projected to become the world's most populous country by 2025, surpassing China, and its population is projected to reach 1.6 billion by 2050. Its population growth rate is 1.2%.

We loaded the census data into tables.
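The post does not show its dataset or loading code, so the following is only a minimal sketch of the idea. It uses Python's built-in `sqlite3` as a stand-in for Spark SQL so that it runs anywhere; on Databricks the equivalent step would typically be reading the file into a DataFrame and registering it as a table. The table name `census`, its columns, and the sample rows are all hypothetical placeholders, not the real census data.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows: the real census file and its columns are not shown in the post.
SAMPLE_CSV = """state,district,population,area_sq_km
State A,District 1,1000,10
State A,District 2,3000,20
State B,District 3,500,50
"""


def load_census(conn, csv_text):
    """Create a `census` table (assumed schema) and fill it from CSV text."""
    conn.execute(
        "CREATE TABLE census ("
        "state TEXT, district TEXT, population INTEGER, area_sq_km REAL)"
    )
    rows = [
        (r["state"], r["district"], int(r["population"]), float(r["area_sq_km"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO census VALUES (?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_census(conn, SAMPLE_CSV)

# Districts per state: the kind of aggregate a "states with number of districts" chart needs.
districts_per_state = conn.execute(
    "SELECT state, COUNT(DISTINCT district) AS districts "
    "FROM census GROUP BY state ORDER BY state"
).fetchall()
print(loaded, districts_per_state)
```

Once the data sits in a table, each chart that follows reduces to a `GROUP BY` query over it.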

India’s States with Number of Districts.

India’s Population Density by District.

Scheduled Castes (SCs) Population per State.

Literacy Rate per State in India

States having Literacy Rate less than 50%
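As an illustration of the kind of query that could sit behind this chart, here is a hedged sketch. The original queries are not published, so the `census` table, the `population` and `literates` columns, and the rows are invented placeholders. Python's built-in `sqlite3` stands in for Spark SQL to keep the example runnable; the `GROUP BY ... HAVING` statement itself is plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per district with total and literate population.
conn.execute("CREATE TABLE census (state TEXT, population INTEGER, literates INTEGER)")
conn.executemany(
    "INSERT INTO census VALUES (?, ?, ?)",
    [
        # Placeholder numbers for illustration only, not real census figures.
        ("State A", 1000, 700),
        ("State A", 1000, 500),
        ("State B", 2000, 800),
        ("State C", 500, 250),
    ],
)

# Aggregate districts up to state level, then keep states below the 50% threshold.
LOW_LITERACY_SQL = """
SELECT state,
       100.0 * SUM(literates) / SUM(population) AS literacy_rate
FROM census
GROUP BY state
HAVING 100.0 * SUM(literates) / SUM(population) < 50
ORDER BY literacy_rate
"""

low_literacy_states = conn.execute(LOW_LITERACY_SQL).fetchall()
print(low_literacy_states)
```

Summing literates and population before dividing weights each district by its size, which avoids the distortion of averaging per-district percentages.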

Gender-wise Literacy Rate per State

Literacy Rate by Education Type per State

Gender Ratio per State
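A gender ratio chart like this one can be derived from male and female population counts. The sketch below assumes the Indian census convention of expressing sex ratio as females per 1,000 males. The table name, the `males` and `females` columns, and the rows are hypothetical placeholders, and `sqlite3` again stands in for Spark SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per district with male and female counts.
conn.execute("CREATE TABLE census (state TEXT, males INTEGER, females INTEGER)")
conn.executemany(
    "INSERT INTO census VALUES (?, ?, ?)",
    [
        # Placeholder numbers for illustration only, not real census figures.
        ("State A", 1000, 950),
        ("State A", 1000, 930),
        ("State B", 500, 510),
    ],
)

# Females per 1,000 males, aggregated from district rows to state level.
SEX_RATIO_SQL = """
SELECT state,
       ROUND(1000.0 * SUM(females) / SUM(males), 1) AS females_per_1000_males
FROM census
GROUP BY state
ORDER BY state
"""

sex_ratios = conn.execute(SEX_RATIO_SQL).fetchall()
print(sex_ratios)
```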

Population by Religion per State

Drinking Water Facility per State

Status of Electricity Facility per State

Education Facility per State

Medical Facility per State

Bus Transportation per State

Road Status per State

Residence Status in India by State

8 comments:

Sachin Hingmire, 24 March 2017 at 22:05
Outstanding work!
Unknown, 25 March 2017 at 07:13
This is a brilliant piece of analysis and can be applied to multiple programmes run by the Indian government.
Bhavesh, 25 March 2017 at 07:59
Thanks a lot, Waseem.
Unknown, 25 March 2017 at 22:24
Hi Bhavesh,
Where is the input dataset used for this analysis?

Please post the Spark SQL code you used to produce these visualizations.

Thanks.
The ideas about data, 30 March 2017 at 06:16
A lot of nice graphs. Is the data set available somewhere for download?
I am asking because our product Querona allows building a logical data warehouse and it emulates SQL Server protocol, translating Transact-SQL to Spark SQL. I would like to migrate your charts and show them from Power BI.
ShiNing-YS, 25 July 2017 at 01:09
very good
Unknown, 27 July 2017 at 00:20
Hi Bhavesh,

This is outstanding work. If possible, could you please upload the dataset and the code to Git so that others can learn from it? I'd really appreciate it.