Open Knowledge Nepal
Thu Jun 20 2019
This blog post was written by Pragya Pradhan, an Open Data Women Fellow 2019.
I am so glad to have stumbled upon the Open Data Women Fellowship 2019 application on Twitter. I persisted and applied on the last day of submission. It was quite an overwhelming application, but I’m glad that I applied. As I had decided to move from digital marketing into data science, I saw this opportunity as a way to figure out how data science is being used in Nepal. I would like to thank Open Knowledge Nepal for giving me a platform to gain more experience and meet interesting people in my field.
One of the reasons I wanted to apply for the fellowship was the opportunity to visit various host organizations for half a day each and attend in-house training while understanding each organization’s mission and how its projects promote open data. We started our journey at the host companies on 15th April with Bikash Udyami and ended with Freedom Forum on 25th April 2019. Here is the timeline of our journey in the host companies:
Moreover, the one-month placement phase to match fellows with host organizations was the icing on the cake.
After the selection process, the ten fellows of the Open Data Women Fellowship visited ten host companies, which included software companies, INGOs, NGOs, and civic tech companies. After visiting all the host companies, we were asked which host company would best match our interests, and the organizing team tried to match our preferences with the host companies. After giving our views, all ten fellows met and were each assigned the placement the organizing team thought would suit us best. I was placed in the Institute of Integrated Development Studies (IIDS).
IIDS is a research institute which conducts studies on a wide range of socio-economic issues such as macroeconomics, foreign aid, agriculture and rural development, food and nutrition, population and health, education, trade and energy, regional cooperation, women in development, poverty alleviation, conflict resolution, and sustainable peace. Data analysis plays a major role in this research.
On my first day at IIDS, Mr. Prabin Dongol, the coordinator of the Fellowship program and the data research unit at IIDS, discussed the plans he had in store for me. In the first half of the day, we discussed improving the visibility of NepStat, a web-based data portal.
NepStat is a time-series data portal where data from all sectors of Nepal can be visualized, compared and downloaded. It is a fantastic hub for researchers, students and anyone who is looking for development data of Nepal. It can be accessed at nepstat.iids.org.np.
Although this data portal was launched last year, it has missed the limelight online because it wasn’t properly optimized for search engines like Google. My first task was to prepare an SEO recommendation list for NepStat and work with its technology partner. Also, on the very day I joined, as someone with SEO experience, I gave a brief introductory SEO session to two people in the data unit. Sharing what I know about search engine optimization was a good experience for me.
In the second half of my first day, Prabin sir gave a basic introduction to the Tableau dashboard. He showed some awesome visualizations and explained how he had built them. My next task was to replicate the Tableau dashboard. In a nutshell, I got to know what I would be working on throughout the month.
My first visualization dashboard was District-wise Labor Migration.
The visualization shows the number of men and women from each district of Nepal migrating abroad for labor jobs. It has data from 2013 to 2017, including information on labor permits. You can compare the ratio of men to women migrating for jobs. The visualization speaks volumes about the trends of people heading overseas for jobs.
One surprising fact I found was that labor migration from Nepal has actually been declining compared with previous years. This data visualization will be useful for a researcher or even an entrepreneur who wants to invest in job opportunities in Nepal. This insight also led me to ponder how valuable Open Data can really be. If this data from the Department of Foreign Employment weren’t openly available, creating this visualization wouldn’t have been possible at all. Open Data can be a boon for citizens, organizations, businesses, and even the government itself. You can check out the final visualization at Nepal Migration Map.
My second task was to prepare a visualization of the Human Development Index of Nepal by district. This gave me the opportunity to work with geospatial files and visualize different indicators of HDI. Check the visualization here:
For the first two weeks of my journey, I prepared visualizations from clean datasets given to me. Then I was given the task of cleaning some data. One of my first cleaning tasks was the dataset for Export Market for Nepali goods. This involved checking the HS Code and correcting the description, removing unwanted columns, and reducing the rows to those matching the required filter. All the small details of cleaning play crucial roles in data visualization. I feel that 75% of the visualization work is done once the dataset is clean and meets the requirements.
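The cleaning steps above can be sketched in a few lines of Python. This is only an illustration, not the workflow actually used at IIDS: the column names, the sample rows, and the rule that a valid HS code has exactly six digits are all assumptions made for the example.

```python
import csv
import io

# Hypothetical raw rows; column names and values are made up for illustration.
RAW_CSV = """hs_code,description,destination,notes,value
090240, black tea ,India,checked,100
0902.40,Black Tea,India,,50
9999,Unknown,India,,10
570110,Woollen carpets,Germany,dup?,200
"""


def normalize_hs_code(code):
    """Keep digits only; accept six-digit codes (an assumption), else return None."""
    digits = "".join(ch for ch in code if ch.isdigit())
    return digits if len(digits) == 6 else None


def clean_rows(text, destination):
    """Check HS codes, tidy descriptions, drop the notes column, filter rows."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        code = normalize_hs_code(row["hs_code"])
        if code is None:
            continue  # drop rows whose HS code cannot be repaired
        if row["destination"] != destination:
            continue  # keep only rows matching the required filter
        cleaned.append({
            "hs_code": code,
            # collapse stray whitespace and use one capitalization style
            "description": " ".join(row["description"].split()).capitalize(),
            "destination": row["destination"],
            "value": float(row["value"]),
        })  # the unwanted "notes" column is simply not copied over
    return cleaned


rows = clean_rows(RAW_CSV, "India")
for r in rows:
    print(r)
```

Keeping each rule in its own small function makes it easy to adjust one step, for example the HS code check, without touching the rest.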
You can view all the cool interactive visualizations on the geo data page of NepStat.
I also got to prepare data from scratch from a PDF of the fiscal budget of the provinces of Nepal for 2018/2019. Data wrangling is a complicated process. Raw data is almost always messy, and cleaning it manually takes a lot of time. So, I looked for an automated way to do it.
With statistical tools like R, you can wrangle data pretty fast if you know how to do it. Data cleaning is a crucial part of data analysis, and it makes you think about what you want from the raw dataset. Prabin sir helped me through these challenges and taught me a few tips and tricks for cleaning data with R and Excel. It felt very satisfying to build a visualization from data I had prepared from scratch.
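To give a flavor of what automated wrangling of PDF-extracted text can look like, here is a minimal sketch in Python rather than the R and Excel I actually used. The line layout, the helper names `parse_amount` and `parse_line`, and the sample lines are all hypothetical; the figures are placeholders, not real budget numbers.

```python
def parse_amount(text):
    """Turn a PDF-extracted number such as ' 1,234.50 ' into a float.

    Returns None for blanks and dash placeholders.
    """
    cleaned = text.strip().replace(",", "")
    if cleaned in ("", "-"):
        return None
    return float(cleaned)


def parse_line(line):
    """Split 'label text   amount' into (label, amount).

    Assumes the last whitespace-separated token on the line is the amount.
    """
    label, _, amount = line.strip().rpartition(" ")
    return " ".join(label.split()), parse_amount(amount)


# Placeholder lines, not real budget figures.
raw_lines = [
    "Province A   Health     1,250.75",
    "Province A   Education  -",
    "  Province B Health  980",
]

records = [parse_line(line) for line in raw_lines]
for label, amount in records:
    print(label, amount)
```

Lines that do not follow the assumed layout would still need to be checked by hand.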
Additionally, I learned how to use VBA with Excel to automate the simple, repetitive task of copy-pasting from different rows and columns. The task no longer felt monotonous once I tried a different approach. I confess that I learned the hard way that some datasets are too dirty and need to be cleaned manually. But I’m thankful that I was given time to learn and apply these skills to the tasks given to me.
Overall, I’m glad I could be of use with digital marketing and learn about the importance of data visualization for the development sector as well. I wish to continue learning more about data analysis and data preparation.
My career goal is to work with some combination of data science and digital marketing, producing data-driven analysis for business and marketing.
With this fruitful experience in data analysis and visualization, I plan to move forward in my career in data science. I especially want to work in and explore the field of data visualization. You can check out all the visualizations that I created on Tableau over here.
To future generations: I want you to know that it is never too late to follow your passion. But don’t rely too much on it, because it’s something that comes and goes randomly. Instead, rely on continuous work. It’s fine to take the time to figure out what you want to do. Try different things and learn from experience. It is also important to keep reinventing yourself, and don’t forget to have a little fun along the way.