Guiding Investments in India to Promote Gender Equality in Education

Simran Sawhney
Apr 10, 2021

Gender inequalities exist across many sectors, and education is no exception. These disparities are particularly notable in India: according to the 2011 Census of India, 82% of males are literate, compared with only 65% of females. This gap is alarming, and it makes clear that investments are needed to bridge it. The question then becomes: where should benefactors invest in India to best achieve this goal? More specifically, which states should be targeted?

Data Collection

Data sources:

Collection methods:

I used a web scraping extension to harvest the data from the sources above and converted the results to CSV format. These files were then imported into R for analysis. For the choropleth exhibit, I downloaded GeoJSON data describing state boundaries in India and overlaid additional information.
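As a rough illustration of the overlay step, the sketch below joins a table of state-level values onto GeoJSON features by state name. The analysis itself was done in R; this is a Python stand-in, and the state names, the values, and the `NAME_1` property key are all assumptions made for the example rather than details from the original files.

```python
import csv
import io
import json

# Hypothetical stand-ins for the scraped CSV and the downloaded GeoJSON.
# State names are placeholders and the values are made up.
CSV_TEXT = """state,poverty_rate
State A,10.0
State B,25.5
"""

GEOJSON_TEXT = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"NAME_1": "State A"}, "geometry": None},
        {"type": "Feature", "properties": {"NAME_1": "State B"}, "geometry": None},
        {"type": "Feature", "properties": {"NAME_1": "State C"}, "geometry": None},
    ],
})


def overlay(geojson, rows, name_key="NAME_1", value_key="poverty_rate"):
    """Attach each row's value to the feature with the matching state name."""
    lookup = {row["state"].strip(): float(row[value_key]) for row in rows}
    for feature in geojson["features"]:
        name = feature["properties"][name_key]
        # States missing from the table get None so a map can grey them out.
        feature["properties"][value_key] = lookup.get(name)
    return geojson


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
merged = overlay(json.loads(GEOJSON_TEXT), rows)
for feature in merged["features"]:
    print(feature["properties"])
```

Joining on the state name is fragile when spellings differ between sources (for example "&" versus "and"), so a real pipeline would likely normalize names or join on a state code instead.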

Data Analysis

The first plots I created were scatterplots gauging male versus female participation in unpaid domestic work. While indispensable to any economy and its overall health, unpaid domestic work has often been seen as outside the purview of economic activity and is stereotyped as a set of responsibilities typically performed by women with little formal education. For definition purposes, unpaid domestic work has two main branches: (1) household maintenance and (2) care of people living in the household. Because the main question of this analysis is where to invest, a few takeaways from these exhibits stand out. The states with the highest female participation in unpaid domestic work are Mizoram (96.8%), Dadra & Nagar Haveli (95.1%), and Daman & Diu (94.8%). However, the states with the largest gender disparities in terms of time use are Gujarat (71.6%), Mizoram (69.5%), Rajasthan (68.8%), Andhra Pradesh (68.1%), and Madhya Pradesh (68.1%). From the second exhibit, poverty rates appear to correlate positively with female participation in unpaid domestic work more strongly than with gender participation disparities.
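The distinction between "highest female participation" and "largest gender disparity" is easy to blur, so here is a minimal Python sketch of the two rankings. The participation figures are placeholders and not the values behind the exhibits; the disparity is assumed to be the female rate minus the male rate.

```python
# Placeholder participation rates (percent); not the article's data.
participation = {
    "State A": {"female": 90.0, "male": 30.0},
    "State B": {"female": 80.0, "male": 10.0},
    "State C": {"female": 95.0, "male": 50.0},
}


def gender_gap(rates):
    """Assumed definition: female rate minus male rate."""
    return rates["female"] - rates["male"]


def rank_states(table, key_fn):
    """Return state names sorted from highest to lowest by key_fn."""
    return sorted(table, key=lambda state: key_fn(table[state]), reverse=True)


by_female = rank_states(participation, lambda rates: rates["female"])
by_gap = rank_states(participation, gender_gap)

print("Highest female participation:", by_female)
print("Largest gender gap:", by_gap)
```

With these placeholder numbers, State C leads on female participation but comes last on the gap, which mirrors how the two lists in the exhibits can name different states.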

The next analysis dealt directly with poverty rates across states. For this, I plotted the data as a choropleth map to visualize “hot spots” and region-wide trends. Poverty appears to be most concentrated in the Central, Northern, and Eastern regions of India.

I also examined the differences between male and female labor participation rates (per 1,000 people) in urban and rural settings, using lollipop charts. In the urban context, the data is less variable, and the differences are highest for Assam (476), Gujarat (472), and Dadra & Nagar Haveli (461). In the rural context, larger disparities exist (e.g., 660 for Daman & Diu and 520 for Chandigarh). There is likely a rationale behind this, but it falls outside the scope of this analysis. The important consideration is how either, neither, or both contexts play into prioritizing specific states for education-related investments. I will revisit this topic later in the discussion.

When deciding which attributes to analyze, I took an interest in sex ratio (females per 1,000 males) as a proxy for social attitudes towards women and, in tandem, the willingness of a community to invest in women’s education. While I cannot definitively point to a causal link through this bar plot, low sex ratios (618 for Daman & Diu and 774 for Dadra & Nagar Haveli) seem to correlate with the other variables discussed above. Equally important, states with high sex ratios (1084 for Kerala and 1037 for Puducherry) seem to be well-off relative to the averages across India.

The final and perhaps most important factor I considered was the difference between male and female literacy rates, which I visualized as a treemap. Although I weighed all of the analyses holistically in my final investment recommendation, this attribute bore the most weight. On a regional basis, Northern India seems to suffer from the largest gaps. At the state level, however, Rajasthan (27.85%), Jharkhand (22.24%), Chhattisgarh (20.86%), Dadra & Nagar Haveli (20.53%), and Madhya Pradesh (20.51%) show the largest differences.

Conclusion

The final two exhibits I put together neatly sum up the information collected above. In the correlation matrix, the variables (v1 through v6) represent the following attributes:

  • v1 = male and female literacy rate differences
  • v2 = poverty rates
  • v3 = female participation rates in unpaid domestic work
  • v4 = male and female labor participation rate differences (urban)
  • v5 = male and female labor participation rate differences (rural)
  • v6 = sex ratios

The following pairs of variables seem to yield high positive correlations: urban and rural labor participation rate differences (0.6), literacy differences and poverty rates (0.5), poverty rates and female participation in unpaid domestic work (0.5), literacy differences and female participation in unpaid domestic work (0.4), and urban labor participation rate differences and sex ratios (0.4).
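For readers who want to see how a matrix like this is computed, the following Python sketch builds a pairwise correlation matrix from columns of state-level values. It assumes the Pearson coefficient, which the write-up does not confirm, and the three short columns are placeholder numbers rather than the real v1 to v6 data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Map every pair of column names to its Pearson coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Placeholder columns, one value per state; not the article's data.
columns = {
    "v1": [27.0, 22.0, 12.0, 5.0],
    "v2": [15.0, 37.0, 10.0, 7.0],
    "v6": [930.0, 950.0, 990.0, 1080.0],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Each coefficient falls between -1 and 1, the diagonal is always 1, and the matrix is symmetric, so a heatmap only needs one triangle of it.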

Because literacy rate differences bear the most weight in this analysis, I will prioritize variables that have high positive correlations with them when interpreting this heatmap. Ranked from highest to lowest priority, these are v1, v2, v3, v4, and v6. I will exclude v5 due to its negative correlation trend across the board.
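The ranking above was applied by judgment, not by formula. One possible way to make it mechanical is a weighted score, sketched below in Python. Everything here is an assumption for illustration: the weights only preserve the priority order, each variable is min-max scaled, the sex ratio is inverted on the premise that a lower ratio signals greater need, and the state values are placeholders.

```python
# Hypothetical weights that only preserve the priority order v1 > v2 > v3 > v4 > v6.
WEIGHTS = {"v1": 5, "v2": 4, "v3": 3, "v4": 2, "v6": 1}
# Assumption: a lower sex ratio signals more need, so v6 is inverted.
INVERTED = {"v6"}


def min_max(values):
    """Scale values to the range 0..1; a constant column maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


def priority_scores(table):
    """Weighted average of scaled variables per state, from 0 (low) to 1 (high)."""
    states = list(table)
    scores = dict.fromkeys(states, 0.0)
    for variable, weight in WEIGHTS.items():
        scaled = min_max([table[state][variable] for state in states])
        for state, value in zip(states, scaled):
            if variable in INVERTED:
                value = 1.0 - value
            scores[state] += weight * value
    total_weight = sum(WEIGHTS.values())
    return {state: score / total_weight for state, score in scores.items()}


# Placeholder values per state; not the article's data.
table = {
    "State A": {"v1": 25.0, "v2": 30.0, "v3": 95.0, "v4": 450, "v6": 800},
    "State B": {"v1": 15.0, "v2": 20.0, "v3": 90.0, "v4": 400, "v6": 900},
    "State C": {"v1": 5.0, "v2": 10.0, "v3": 85.0, "v4": 350, "v6": 1000},
}

scores = priority_scores(table)
for state in sorted(scores, key=scores.get, reverse=True):
    print(state, round(scores[state], 2))
```

A different choice of weights or scaling could reorder states that sit close together, so a score like this is better read as a starting point for discussion than as a final ranking.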

With this list and chart in mind, I back the following investment recommendation:

Among its 35 states and union territories, India should prioritize Rajasthan, Jharkhand, Bihar, Chhattisgarh, Dadra & Nagar Haveli, and Daman & Diu when allocating its education budget.
