A happy new year to you all!
I will first discuss our work over the past three months. Then, I will describe our goal for the next three months.
Q4 in Review
Ambuda has had a productive but curious quarter.
This quarter, we created Vidyut, a sophisticated Sanskrit processing toolkit. Vidyut is almost ready for use on Ambuda, and we look forward to showing you the results of our work soon. As a preview, here is a demo of our Paninian word generator — special thanks to Shreevatsa R. for preparing this code for use in a web browser.
In addition, Kishore has made great progress on simplifying our onboarding and development setup, which will greatly improve our ability to onboard new engineers. Special thanks to Ashwin for his assistance here as well.
Next, our proofing work continues to go well. Suhas has conducted multiple trials of paid proofing to see how that might accelerate our work. Our initial trials have been very promising, and we look forward to continuing that work in the new year.
Finally, we have clearer strategies in place for creating a legal framework for our project and pursuing official non-profit status. This process takes time, and I will share updates when the time is right. Thanks especially to Ashwin for his recommendations here.
At the same time, however, most of the work above was not on our Q3 roadmap. I did not expect to be writing Vidyut, and creating its components took more time than I had thought. So although we have made substantial progress, there’s a feeling of having missed the mark.
I’ve also learned from conversations with our community that it’s not always obvious what Ambuda’s top priorities are or how to track progress on our overall project.
I think I can do a better job of crisply stating our goals and making our priorities clear. To start, I would like to describe the core of Ambuda, which will always be our project’s top priority.
I think of Ambuda as a pipeline. This pipeline has three critical stages, where each stage flows into the next. These three stages are:
Transcribing. We find scanned Sanskrit books and convert them to high-quality text files. We do so with the help of OCR tools, manual proofing, and applications like our proofing tool.
Structuring. Once we have a text file, we convert it into a structured format by defining headers, sections, footnotes, variant readings, and so on. So far, we have done so manually, which is tedious and error-prone.
Analyzing. Once a plain text file has been structured, we must analyze it by undoing sandhi and analyzing words. So far, we have reused data from other projects. In the future, we can use Vidyut for this task.
With this simple model in mind, here are the challenges we must face at each stage:
Transcribing can subsist on volunteer effort alone. But to truly scale, we need money. In addition, there is still plenty of room to improve our proofing tools and remove more of the tedium required in proofing a text.
Structuring is tedious to do manually but tricky to get right with software. I think this is where Ambuda is weakest and where we have the most room to improve.
Analyzing needs tools that are good, fast, accessible, and easy to use. Few tools meet all four of these criteria, which is why we created Vidyut. Vidyut has made substantial progress, but there will always be room to improve it.
Whatever else Ambuda might do, this is our core. We must ensure that the pipeline flows.
Growing the pipeline
Given the model above, what is most important? How do we ensure that the pipeline flows?
There are dozens of answers to this question: better infrastructure, better onboarding, more fundraising, more publicity, more languages, a better legal framework, more partnerships, more users, more dictionaries, …
All of these are important, and I have mentioned many of them in our community already. But looking at them now, I think these answers distract from the core of the issue. They are incidental rather than essential.
Right now, what is essential is that Ambuda should grow.
Growth energizes the community and boosts morale. Growth brings new users, new contributors, and new donors. When we grow, our growing pains become obvious, and it’s clear what our priorities are. And when growth stalls, our priority should be to grow further.
Therefore, our goal for Q1 is simple: to grow. I suggest two simple metrics to track this, and these will be our sole goals for this quarter:
By the end of Q1, our library should grow, at minimum, at a rate of one text or translation per week. The goal here is sustainable, regular growth over time. (Current rate: 0 texts per week.)
By the end of Q1, our library should have, at minimum, 4000 monthly active users. (Currently: 1800.)
Tactically, this means that our top priority is to ensure that our pipeline of texts is flowing. And that means stronger support for onboarding our engineers and helping them be productive on our platform.
Our previous quarter was necessary so that we could build a stronger foundation for the project. Now is the time to build on that foundation and grow as well as we can.
Our work is just beginning. Thank you for supporting Ambuda. If you would like to support our work with a small donation, you can do so here.
1 January 2023
Appendix: Funds as of 2023-01-01
Since I am short on time, I will update this appendix by the end of the quarter. Briefly, technical spend is consistent with our last update. We are spending more money on proofing projects and have received more monetary support to continue our work.
Q3 in review
Over the past three months, more and more people have learned about Ambuda:
I’m utterly impressed with this project and the sheer quality of its execution. I’ve dreamed about a tool like this for over a decade and you’re leading this effort far better than I ever could.
The site is beautiful and very intuitive/user-friendly. The one-click de-sandhi-ing + the one-click parses/definitions are so slick. […] everything here is just so clean/smart.
This new Ambuda site is amazing ! Great job ! I look forward to see it grow. It is much better than any Skt site i have seen.
I don’t think it gets better than this — actually, it will get better than this, but I feel this is what it should all culminate into.
Our breakthrough library now contains 18 texts, 12 of which have clickable word meanings. Readers can look up these words in one of 9 integrated dictionaries that together support 4 different languages. Our site interface has become multilingual as well, and it now has strong support for Sanskrit, Marathi, and Telugu.
In addition, we built and launched a new proofing tool that has rapidly become the best and easiest way to convert scanned Sanskrit books into machine-readable text. More than 25 contributors have already made more than 1200 edits through our system.
On the software side, we grew our technical platform to make it substantially easier for others to contribute to our work, and we have been fortunate to receive technical contributions from 10 different people so far. In addition, we have found two promising Sanskrit parsing tools and are exploring new working relationships with researchers in India. I’ll say more about this work in a future update.
As for our team, Ambuda now has a group of around a dozen people who regularly contribute to project discussions and other work related to the project. More formally, our core team has grown to include four members: myself, Suhas Mahesh, Ashwin Ramaswami, and Kishore Chitrapu. I am also pleased to share that Bibek Debroy has agreed to be an advisor to Ambuda.
Finally, Ambuda has received more than ₹1,60,000 ($2,000) in donations, all of which will be vital as we continue to grow.
Overall, we have met almost all of our Q3 goals. The main exception, however, is that our library of 18 texts is much smaller than the 108 we had originally planned.
Why did we miss our goal here? There are three reasons:
To give people a meaningful way to contribute non-technically, I spent a substantial amount of time building and refining the proofing interface.
To help onboard others and grow the technical team, I spent a substantial amount of time on outreach, documentation, and testing.
Due to various life obligations, I had to take some time off from the project. During this time, technical contributors were unable to deploy any changes.
All of these reasons have the same root cause: dependence on a single person.
Our advisors have told us that for Ambuda to succeed long-term, we must avoid dependence on a single person. So as we continue to the end of the year, I want to focus on building systems to reduce this weakness. When these systems are firmly in place, Ambuda will be poised for incredible growth.
Ambuda’s mission is to make the Sanskrit tradition radically accessible. Specifically, we have three goals:
- Create a complete archive of traditional Sanskrit literature.
- Publish this archive in an open format.
- Integrate this archive with intelligent tools for students and scholars.
The theme of this quarter is systems. In a sentence, my goal is to build systems that let the entire project run even if I am unavailable.
A complete archive
By the end of Q4, Ambuda will have:
- a simple user interface for adding new texts to the library.
- an end-to-end interface for converting PDFs into library texts.
- support for adding translations, commentaries, and audio to an existing text.
- support for displaying multiple editions of a given text.
The main roadblock to adding texts is that our process is highly manual and requires some technical skill. As much as possible, I want to eliminate the work required to add a new text to the library.
I likewise want to do the same for translations, commentaries, and audio, none of which we support currently. All three of these resources are tremendously useful and help us provide a much richer experience for our users.
Editions are important because different communities often prefer different editions of a text. Ambuda should serve the needs of all of these communities impartially.
Once these systems are in place, we can rapidly add all of these resources to Ambuda.
An open format
By the end of Q4, Ambuda will support downloading texts as XML files and PDFs.
Ambuda is committed to open data and open access. But the longer we go without making our resources easy to download, the more difficult it is to add that functionality later. At a minimum, Ambuda will let users download the raw XML files that we use to create our reader. In addition, we will also support simple PDF downloads so that it is easier to read these texts offline.
By the end of Q4, Ambuda will have:
- an online tool that parses arbitrary Sanskrit.
- an interface for creating and correcting parse data for Sanskrit texts.
At our current size, Ambuda’s distinguishing feature is its one-click word analysis. To extend this feature to every text in our library, we will continue to invest in our tooling here and build a simple interface for parsing texts and correcting parse data.
In addition to direct work on our mission, we will also make the Ambuda team stronger and more resilient.
By the end of Q4, Ambuda will have clear community guidelines that our contributors accept by consensus.
Our community is growing, and we need to broadly agree on how we should communicate with each other effectively. Our guidelines should be lightweight and welcoming while also setting clear rules that we can all accept and follow.
By the end of Q4, Ambuda will have a clear plan for being recognized as a non-profit organization.
Ambuda is a non-profit in spirit but not in law. Legal recognition has substantial benefits for the project:
We will receive substantial discounts on the technology we will need as we grow, including email and cloud services.
We will have a credible process for handling donations responsibly.
We will have a credible process for appointing a new project lead in case I am no longer able to serve in that capacity.
In short, the benefit is that we will become a much more credible group.
I am new to this process and don’t know how long it takes. So, my focus is on creating a strategy and timeline here.
By the end of Q4, Ambuda will accept international donations and have a much better user experience for requesting and receiving donations.
The easiest way to scale our proofing effort is to hire proofreaders directly. To do so effectively, we need to raise more money and make it easier for others to donate to Ambuda.
Our plans for this quarter focus heavily on internal concerns. But by making the bones of our project stronger, we will be well positioned for the future.
Our work is just beginning. Thank you for supporting Ambuda, and please contact us if you would like to help in any capacity.
1 October 2022
Appendix: Funds as of 2022-10-01
Our expenses so far have been low. But once we start scaling our proofing effort, our expenses will increase greatly. I’m planning to hold a one-year reserve for website costs (around $250) and use the rest to hire proofreaders.
- $2264.44 (PayPal donations)
- $9.16 (
- $75.80 (
- $7.98 (
- $9.81 (DigitalOcean hosting, June 2022)
- $12.00 (DigitalOcean hosting, July 2022)
- $12.00 (DigitalOcean hosting, August 2022)
- $12.00 (DigitalOcean hosting, September 2022)
- $0.02 (AWS storage, August 2022)
- $0.02 (AWS storage, September 2022)
- $1.53 (Google Cloud API, September 2022)
- $0.33 (Plausible analytics, June 2022)
- $0.62 (Plausible analytics, July 2022)
- $1.02 (Plausible analytics, August 2022)
- $1.05 (Plausible analytics, September 2022)
Funds remaining: $2121.10
- Our Plausible plan is shared across multiple projects, including learnsanskrit.org. It is billed at $96.00 per year, or the equivalent of $8.00 per month, with a monthly quota of 100k views. Ambuda’s monthly page views are 4.1k (June), 7.7k (July), 12.7k (August), and 13.1k (September), which yield prorated costs of $0.33, $0.62, $1.02, and $1.05.
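For anyone who wants to check the prorated analytics figures, here is a minimal sketch of the arithmetic in Python. It assumes the cost is split purely by Ambuda's share of the 100k monthly quota and rounded to the nearest cent. The names `prorated_cost`, `PLAN_COST_PER_MONTH`, and `MONTHLY_QUOTA` are made up for this illustration and are not part of any Ambuda code.

```python
from decimal import Decimal, ROUND_HALF_UP

# Figures taken from the note above; the names are hypothetical.
PLAN_COST_PER_MONTH = Decimal("8.00")  # $96.00 per year / 12 months
MONTHLY_QUOTA = 100_000                # page views included in the shared plan

# Ambuda's monthly page views, as reported above.
page_views = {
    "June": 4_100,
    "July": 7_700,
    "August": 12_700,
    "September": 13_100,
}


def prorated_cost(views: int) -> Decimal:
    """Return Ambuda's share of the monthly plan cost, rounded to cents.

    Assumption: the share is simply views / quota, with no minimum charge.
    """
    share = Decimal(views) / Decimal(MONTHLY_QUOTA)
    return (PLAN_COST_PER_MONTH * share).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


costs = {month: prorated_cost(views) for month, views in page_views.items()}

for month, cost in costs.items():
    print(f"{month}: ${cost}")
```

Under these assumptions the script reproduces the four amounts listed above: $0.33, $0.62, $1.02, and $1.05.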
As part of our mission to make the Sanskrit tradition radically accessible, we have started to increase our site’s support for other languages. We currently support both Sanskrit and English, and you can choose between them from our site’s main page.
Please help us make Ambuda available in your language. You don’t need to know any Sanskrit to help us with our work.
There are three ways you can help.
1. Translate our interface to your language
If you want to help us translate our English interface to your language, please join our translation project on Transifex.
The language on the Transifex site is a little confusing, so just to clarify: translator accounts are free forever, and you don’t need to pay anything to sign up.
2. Help us find Sanskrit dictionaries in your language
We also want to add more Sanskrit dictionaries to our site. Please let us know if you have access to other dictionaries we can digitize and support.
So far, we have resources for the following language pairs:
- Sanskrit-Sanskrit (वाचस्पत्यम्, शब्दकल्पद्रुमः, various koshas)
- Sanskrit-Hindi (Apte)
- Sanskrit-Kannada (शब्दार्थकौस्तुभः)
- Sanskrit-Telugu (PDF of the सर्वशब्दसम्बोधिनी)
- Sanskrit-Marathi (hard copy of the गीर्वाणलघुकोश)
- Sanskrit-Gujarati (PDF of the शब्दरत्नमहोदधिः)
As well as:
- Sanskrit-English (various)
- Sanskrit-German (various)
- Sanskrit-French (various)
We are most interested in plain text files and PDF scans. If you have a book in hard copy, please consider using an app like vFlat Scan to digitize it.
3. Tell your friends
If you don’t have the time to help us directly, please help us find other volunteers by sharing this post with your friends and family. Every bit helps!