Season of Commits 2025

rahulporuri · 25 June 2025 04:08

For the inaugural Season of Commits (2025), we have one project and one contributor:

Project page - Welcome to PyDataStructs’s documentation! — PyDataStructs 0.0.1 documentation
Source code - GitHub - codezonediitj/pydatastructs: A python package for data structures and algorithms
Maintainer - @czgdp1807
Contributor - @prerak_singh
Stipend to contributor - 25,000 INR

Over a three-month duration (Jun-Aug), @prerak_singh will be working on PyDataStructs under the guidance of @czgdp1807. He has already started sharing weekly updates on his blog: https://prex03.github.io/

How did this come about?

@czgdp1807 reached out to us to ask whether we could support @prerak_singh's contributions to the PyDataStructs project. @czgdp1807 agreed to participate in our Season of Commits program as the maintainer so that @prerak_singh could receive that support.

czgdp1807 · 25 June 2025 17:13

Thank you so much @rahulporuri for the opportunity.

@prerak_singh Keep posting your updates here. Looking forward to an exciting output.

prerak_singh · 26 June 2025 12:19

Thanks for the warm welcome and the opportunity to contribute to this wonderful program @rahulporuri.

Here are the blog posts detailing my progress in the first three weeks. I’ll keep adding weekly updates here for future blog posts as well.

Week 1 - Boosting PyDataStructs Graphs with C++ Backends and Benchmarking Tools | Prerak Singh
Week 2 - Optimizing PyDataStructs - BFS in C++, Benchmarking C++ Graphs, and Smarter Node Storage | Prerak Singh
Week 3 - Optimizing PyDataStructs - Prims in C++, Benchmarking algorithms, and Smarter Node & Edge Storage | Prerak Singh

prerak_singh · 30 September 2025 08:25

Hi everyone! Here is a quick update on the progress of my project.

I’ve now completed two-thirds of my Season of Commits journey with FOSS United, working on accelerating the PyDataStructs library with C++ and LLVM backends. Here’s a week-wise recap of the progress so far:

Week 1: Implemented C++ backends for adjacency list and matrix graphs, along with a benchmarking framework to compare PyDataStructs and NetworkX on creation time and memory.
Week 2: Added BFS in C++, extended benchmarking, and optimized node storage using std::variant for faster, safer operations.
Week 3: Introduced std::variant-based edge storage and implemented Prim’s MST in C++ with improved performance and type safety.
Week 4: Added Dijkstra’s algorithm in the C++ backend, benchmarked it against NetworkX, and identified string-based node IDs as a key bottleneck for scaling.
Week 5: Enabled arbitrary Python objects in graph nodes and benchmarked Prim’s algorithm, achieving up to 2.3× faster performance than NetworkX on large graphs.
Week 6: Built the first LLVM backend (Bubble Sort) using llvmlite, laying the foundation for JIT compilation and runtime optimization in PyDataStructs.
Week 7: Optimized the LLVM Bubble Sort with vectorization and O3 passes, achieving 600× speedups over Python and outperforming the C++ backend.
Week 8: Extended the LLVM backend to adjacency list graphs, enabling LLVM-compiled graph operations with a Python API (working on macOS, with a small fix pending for Ubuntu).
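
For readers unfamiliar with the std::variant approach mentioned in Weeks 2 and 3, here is a minimal sketch of the general idea. It is not the actual PyDataStructs code: the names `NodeData`, `Node`, `describe`, and `as_int` are hypothetical, and the set of payload types is an assumption chosen for illustration. It needs a C++17 compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical payload type (an assumption, not the PyDataStructs definition):
// a node holds nothing, an integer, a double, or a string.
using NodeData = std::variant<std::monostate, std::int64_t, double, std::string>;

// Hypothetical node: a name plus a type-safe payload.
struct Node {
    std::string name;
    NodeData data;
};

// Dispatch on the stored type at compile time via std::visit,
// instead of parsing strings or casting raw pointers.
std::string describe(const Node& node) {
    return std::visit(
        [](const auto& value) -> std::string {
            using T = std::decay_t<decltype(value)>;
            if constexpr (std::is_same_v<T, std::monostate>) {
                return "empty";
            } else if constexpr (std::is_same_v<T, std::int64_t>) {
                return "int:" + std::to_string(value);
            } else if constexpr (std::is_same_v<T, double>) {
                return "double:" + std::to_string(value);
            } else {
                return "string:" + value;
            }
        },
        node.data);
}

// Typed accessor: the assert catches a wrong-type read in debug builds,
// and std::get throws std::bad_variant_access otherwise.
std::int64_t as_int(const Node& node) {
    assert(std::holds_alternative<std::int64_t>(node.data));
    return std::get<std::int64_t>(node.data);
}
```

With these definitions, `describe(Node{"a", std::int64_t{7}})` yields `"int:7"`, and a node constructed without a payload reports `"empty"`. The appeal of this design is that the compiler knows every type a node can hold, so a mismatched access is an explicit error rather than silent misinterpretation of the data.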

Detailed blog posts for all weeks can be found here.

In the final phase of the project, I’ll be focusing on adding LLVM backends for graph algorithms and other data structures, while exploring advanced optimizations to push performance further.