Introducing Aggregation
Almost every college student who loves coding and has tasted development dreams of being selected for the Google Summer of Code (GSoC) program. I am grateful that I got an opportunity to work with signac via the GSoC program. In this post, I will describe my introduction to the signac team and an overview of the aggregation project.
About Me!
I am a student at the Indian Institute of Technology Roorkee, pursuing a Bachelor of Technology degree in Polymer Science and Engineering. My interests lie in Data Science and Artificial Intelligence. In the past, I have developed a few Android applications and have tried my hand at web development. Now I am working as an intern for signac via the GSoC program.
Why signac?
Package development was a topic I was curious to know more about, and the wonderful signac team gave me the opportunity to learn. Before the official deadline for submitting GSoC proposals, several meetings were held for people who wanted to learn about the signac-flow code base, package development, continuous integration, etc. I found the meetings extremely helpful, and they convinced me that I had come to the right place.
Did I have any prior work-related experience with package development or git?
To be honest, no! I will always remember the first issue that was assigned to me. It was becoming really hard for me to figure out everything by myself. But this is where the people at signac helped me understand how things work in the world of open source! After that, things went pretty smoothly, all thanks to the team.
The Project: Aggregation of Workflows
Every operation in signac currently takes in only a single job as a parameter. The goal of my project is to enable aggregation of jobs over an operation. signac is frequently used for multi-dimensional parameter sweeps where analyses typically involve grouping jobs according to common state point parameters, for instance to make plots that average over replicas. As a result, operating on multiple jobs at once is a highly desirable feature.
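To make the idea concrete, here is a minimal sketch in plain Python of what grouping jobs by a common state point parameter and averaging over replicas could look like. It does not use the signac or signac-flow API: the `jobs` list of dictionaries, the `energy` value, and the helper names `aggregate_by` and `average_energy` are all hypothetical stand-ins chosen for illustration.

```python
from itertools import groupby
from statistics import mean

# Hypothetical stand-ins for signac jobs: each dict holds a state point
# and one measured value. Real jobs would come from a signac project.
jobs = [
    {"statepoint": {"temperature": 1.0, "replica": 0}, "energy": 2.0},
    {"statepoint": {"temperature": 1.0, "replica": 1}, "energy": 4.0},
    {"statepoint": {"temperature": 2.0, "replica": 0}, "energy": 5.0},
    {"statepoint": {"temperature": 2.0, "replica": 1}, "energy": 7.0},
]


def aggregate_by(jobs, key):
    """Group jobs that share the same value of one state point parameter."""
    def get_key(job):
        return job["statepoint"][key]

    ordered = sorted(jobs, key=get_key)  # groupby only merges adjacent items
    return {value: list(group) for value, group in groupby(ordered, key=get_key)}


def average_energy(*aggregate):
    """An 'aggregate operation': it receives many jobs instead of one."""
    return mean(job["energy"] for job in aggregate)


# Average over replicas at each temperature.
averages = {
    temperature: average_energy(*group)
    for temperature, group in aggregate_by(jobs, "temperature").items()
}
print(averages)
```

The point of the sketch is the signature of `average_energy`: an operation that accepts a whole group of jobs rather than a single one, which is the kind of behavior the project aims to support.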
The Community Bonding Period
My goals for the community bonding period were to get to know my mentors well, build a strong bond with the community, and learn more about signac. There were several parts of my project that I was confused about, and I figured out many of them during this period. My main questions were:
- Why do we need signac?
- How do operations actually get executed?
- How are operations registered?
- How will I merge the new aggregation concept with the recently introduced FlowGroup feature?
- How will I submit aggregate operations?
I have now answered most of these questions and have started working actively on my project with a positive attitude.
Getting started with signac-flow
The signac-flow package is composed of many classes that did not immediately make sense to me. Even after taking up a few small issues and fixing them with some help, I wasn’t able to understand how flow actually works. To understand how flow works, I think you have to become a user first. Once I realized that, I added print statements (or breakpoints), studied the code base, and played with signac. This strategy worked well for me and may help other new developers. You can start with the examples in the official signac documentation or write scripts of your own.
Connecting with the signac team
For real-time user support, you can join signac’s Gitter chat room. For code development, you can join the project’s primary communication channel, the Slack workspace.
Thank you for sticking with me so far. Finally, I’d like to thank my parents, the signac team, and my friends for helping me achieve this. I will be back with updates on my work in my next blog post.