Plans for 2026

3 months ago

After a long break, Ambuda is resuming regular activity.

Our library now contains more than 200 texts, including several rare Upanishads, several plays by Bhasa, and a variety of texts related to Advaita Vedanta. To adjust to our larger library, we have made some visual updates to the site that make it easier to find and explore a given text. These include a search bar on the front page, a catalog view for exploring the library as a whole, and new support for bulk downloads and exports.

Our proofing engine has received several major upgrades, including a new visual editor and better error-checking capabilities. We have also experimented with releasing “unproofed” texts with minimal human correction. Advances in optical character recognition (OCR) have helped enormously in making this work possible.

To reach a wider audience and simplify our operations, most project discussion is now on a public mailing list (https://groups.google.com/g/ambuda-discuss). Please join if you are interested in our work!

Our main goal this year is to keep up the momentum and publish as many texts as we can. There are three bottlenecks for doing so.

The first bottleneck is time. Most of our new texts were proofread by me. While I enjoy proofreading, I have limited time and need to grow our proofing capacity so that it is easier for others to contribute and help.

The second bottleneck is scanned documents. While plenty of scanned documents exist on archive.org and other sites, the trick is to find a high-quality scan that gives our automated systems the best chance to succeed.

The third bottleneck is money, both for hiring assistants and for paying for services like OCR.

I intend to aggressively pursue becoming a registered nonprofit so that we can raise the funds necessary to remove these bottlenecks.

While there is more I could say, actions speak louder than words.

Now it is time to get to work. Thank you for your support, and please join the ambuda-discuss mailing list if you want to follow along.

Arun

Ambuda Q2 Plan

3 years ago

Despite the normal ups and downs of daily life, Ambuda has continued to make steady progress toward its goal. This post summarizes our work in 2023 and our plans for the next three months.

Q1 in Review

Our main progress this quarter is that we’ve proofread a few dozen new texts and translations, many of which will soon find their way into our library. The full list is:

Shivatandavastotra + commentary
Dakshinamurty Stotra + commentary
Vayu Stuti + commentary
Tripuravijayachampu
Meghaduta with Vidyullata
Nataraja Stuti + commentary
Shivamahimna stotra
Damodaramadhavastotra
Meghaduta (Kannada translation)
Suryashataka
Somanathashataka
Vivekacudamani + vyakhya
Madhuravijaya + translation
Anyoktimuktalata
Dashavatarastotra + commentary
Kavyashiksha
Kadambarisangraha
Political Concepts in Ancient India
Aryasaptashati
Madhva Vijaya (Kannada translation)
Vivekacudamani (Kannada translation)
Yoga-Sankhya-Kosha
Sangraha-Ramayana (Kannada translation)
Chandishataka + commentary
Shankara Vedanta Kosha (headwords)
Kushakumudvatiya Nataka

Our user base has likewise grown from 1800 monthly users to 3000 monthly users. While short of our goal of 4000, this is still an increase of roughly 67% with no explicit outreach on our part.

On the technical side, our Vidyut engine continues to progress, but we need a little more time to incorporate it into our website. For our Paninian derivation engine, our major progress includes:

Substantial progress on prefixed verbs, including support for words like sañcaskāra, samudyāt, and various retroflexions of n and s after certain prefixes.
Substantial progress on sanādi-dhātus. Our last major edge case is on verbs where the abhyāsa is elided (mitsati, lipsati, etc.), and once these are complete, we will be ready to incorporate this library into Ambuda.
Substantial progress on subantas (nominals), including sarvanāmas (pronouns).
Substantial progress on kṛt (verbal) suffixes, including uṇādi suffixes.
Partial support for taddhita (nominal) suffixes.
Explicit tests for more than 600 Ashtadhyayi sutras based on the Kashika Vrtti.

One Year of Ambuda

Our top priority is still to grow our end-to-end pipeline for Sanskrit texts. For details, see the notes in our Q1 post from a few months ago.

Therefore, our goals for the end of Q2 are as follows:

Our library will have a total of 50 primary texts on our website. (Current: 18.)
Our library will have at least 10 translations and commentaries. (Current: 0.)
Our library will have, at minimum, 5000 monthly active users. (Currently: 3000.)
Our Vidyut work will be fully integrated into Ambuda, both for our dictionary and for our proofing work. (Currently: no integration.)

Closing Thoughts

We started work on Ambuda on 1 June 2022, which means that Ambuda is almost a year old. Although our project has come a long way, we still have a long way to go, and I am happy to see our investments in our core technology start to bear fruit.

Our work is just beginning. Thank you for supporting Ambuda.

Arun Prasad

1 April 2023

Appendix: Funds as of 2023-04-01

We are currently running a deficit of a few thousand dollars, which is workable for now but not sustainable. If you would like to support our work, please give us the gift of a donation so that we can continue. 100% of the money we receive goes to proofreaders in India and toward basic operations costs, such as web hosting.

Ambuda Q1 Plan

3 years ago

A happy new year to you all!

I will first discuss our work over the past three months. Then, I will describe our goal for the next three months.

Q4 in Review

Ambuda has had a productive but curious quarter.

This quarter, we created Vidyut, a sophisticated Sanskrit processing toolkit. Vidyut is almost ready for use on Ambuda, and we look forward to showing you the results of our work soon. As a preview, here is a demo of our Paninian word generator — special thanks to Shreevatsa R. for preparing this code for use in a web browser.

In addition, Kishore has made great progress on simplifying our onboarding and development setup, which will greatly improve our ability to onboard new engineers. Special thanks to Ashwin for his assistance here as well.

Next, our proofing work continues to go well. Suhas has conducted multiple trials of paid proofing to see how that might accelerate our work. Our initial trials have been very promising, and we look forward to continuing that work in the new year.

Finally, we have clearer strategies in place for creating a legal framework for our project and pursuing official non-profit status. This process takes time, and I will share updates when the time is right. Thanks especially to Ashwin for his recommendations here.

At the same time, however, most of the work above was not on our Q3 roadmap. I did not expect to be writing Vidyut, and creating its components took more time than I had thought. So although we have made substantial progress, there’s a feeling of having missed the mark.

I’ve also learned from conversations with our community that it’s not always obvious what Ambuda’s top priorities are or how to track progress on our overall project.

I think I can do a better job of crisply stating our goals and making our priorities clear. To start, I would like to describe the core of Ambuda, which will always be our project’s top priority.

The pipeline

I think of Ambuda as a pipeline. This pipeline has three critical stages, where each stage flows into the next. These three stages are:

Transcribing. We find scanned Sanskrit books and convert them to high-quality text files. We do so with the help of OCR tools, manual proofing, and applications like our proofing tool.
Structuring. Once we have a text file, we convert it into a structured format by defining headers, sections, footnotes, variant readings, and so on. So far, we have done so manually, which is tedious and error-prone.
Analyzing. Once a plain text file has been structured, we must analyze it by undoing sandhi and analyzing words. So far, we have reused data from other projects. In the future, we can use Vidyut for this task.

With this simple model in mind, here are the challenges we must face at each stage:

Transcribing can subsist on volunteer effort alone. But to truly scale, we need money. In addition, there is still plenty of room to improve our proofing tools and remove more of the tedium required in proofing a text.
Structuring is tedious to do manually but tricky to get right with software. I think this is where Ambuda is weakest and where we have the most room to improve.
Analyzing needs tools that are good, fast, accessible, and easy to use. Few tools meet all four of these criteria, which is why we created Vidyut. Vidyut has made substantial progress, but there will always be room to improve it.

Whatever else Ambuda might do, this is our core. We must ensure that the pipeline flows.

Growing the pipeline

Given the model above, what is most important? How do we ensure that the pipeline flows?

There are dozens of answers to this question: better infrastructure, better onboarding, more fundraising, more publicity, more languages, a better legal framework, more partnerships, more users, more dictionaries, …

All of these are important, and I have mentioned many of them in our community already. But looking at them now, I think these answers distract from the core of the issue. They are incidental rather than essential.

Right now, what is essential is that Ambuda should grow.

Growth energizes the community and boosts morale. Growth brings new users, new contributors, and new donors. When we grow, our growing pains become obvious, and it’s clear what our priorities are. And when growth stalls, our priority should be to grow further.

Therefore, our goal for Q1 is simple: to grow. I suggest two simple metrics to track this, and these will be our sole goals for this quarter:

By the end of Q1, our library should grow, at minimum, at a rate of one text or translation per week. The goal here is sustainable, regular growth over time. (Current rate: 0 texts per week.)
By the end of Q1, our library should have, at minimum, 4000 monthly active users. (Currently: 1800.)

Tactically, this means that our top priority is to ensure that our pipeline of texts is flowing. And that means stronger support for onboarding our engineers and helping them be productive on our platform.

Closing thoughts

Our previous quarter was necessary so that we could build a stronger foundation for the project. Now is the time to build on that foundation and grow as well as we can.

Our work is just beginning. Thank you for supporting Ambuda. If you would like to support our work with a small donation, you can do so here.

Arun Prasad

1 January 2023

Appendix: Funds as of 2023-01-01

Since I am short on time, I will update this appendix by the end of the quarter. Briefly, technical spend is consistent with our last update. We are spending more money on proofing projects and have received more monetary support to continue our work.

Ambuda Q4 Plan

3 years ago

Q3 in review

Over the past three months, more and more people have learned about Ambuda:

I’m utterly impressed with this project and the sheer quality of its execution. I’ve dreamed about a tool like this for over a decade and you’re leading this effort far better than I ever could.

The site is beautiful and very intuitive/user-friendly. The one-click de-sandhi-ing + the one-click parses/definitions are so slick. […] everything here is just so clean/smart.

This new Ambuda site is amazing ! Great job ! I look forward to see it grow. It is much better than any Skt site i have seen.

I don’t think it gets better than this — actually, it will get better than this, but I feel this is what it should all culminate into.

Our breakthrough library now contains 18 texts, 12 of which have clickable word meanings. Readers can look up these words in one of 9 integrated dictionaries that together support 4 different languages. Our site interface has become multilingual as well, and it now has strong support for Sanskrit, Marathi, and Telugu.

In addition, we built and launched a new proofing tool that has rapidly become the best and easiest way to convert scanned Sanskrit books into machine-readable text. Already, more than 25 contributors have made more than 1200 edits through our system.

On the software side, we grew our technical platform to make it substantially easier for others to contribute to our work, and we have been fortunate to receive technical contributions from 10 different people so far. In addition, we have found two promising Sanskrit parsing tools and are exploring new working relationships with researchers in India. I’ll say more about this work in a future update.

As for our team, Ambuda now has a group of around a dozen people who regularly contribute to project discussions and other work related to the project. More formally, our core team has grown to include four members: myself, Suhas Mahesh, Ashwin Ramaswami, and Kishore Chitrapu. I am also pleased to share that Bibek Debroy has agreed to be an advisor to Ambuda.

Finally, Ambuda has received more than ₹1,60,000 ($2,000) in donations, all of which will be vital as we continue to grow.

Overall, we have met almost all of our Q3 goals. The main exception, however, is that our library of 18 texts is much smaller than the 108 we had originally planned.

Why did we miss our goal here? There are three reasons:

To give people a meaningful way to contribute non-technically, I spent a substantial amount of time building and refining the proofing interface.
To help onboard others and grow the technical team, I spent a substantial amount of time on outreach, documentation, and testing.
Due to various life obligations, I had to take some time off from the project. During this time, technical contributors were unable to deploy any changes.

All of these reasons have the same root cause: dependence on a single person.

Our advisors have told us that for Ambuda to succeed long-term, we must avoid dependence on a single person. So as we continue to the end of the year, I want to focus on building systems to reduce this weakness. When these systems are firmly in place, Ambuda will be poised for incredible growth.

Our mission

Ambuda’s mission is to make the Sanskrit tradition radically accessible. Specifically, we have three goals:

Create a complete archive of traditional Sanskrit literature.
Publish this archive in an open format.
Integrate this archive with intelligent tools for students and scholars.

The theme of this quarter is systems. In a sentence, my goal is to build systems that let the entire project run even if I am unavailable.

A complete archive

By the end of Q4, Ambuda will have:

a simple user interface for adding new texts to the library.
an end-to-end interface for converting PDFs into library texts.
support for adding translations, commentaries, and audio to an existing text.
support for displaying multiple editions of a given text.

The main roadblock to adding texts is that our process is highly manual and requires some technical skill. As much as possible, I want to eliminate the work required to add a new text to the library.

I want to do the same for translations, commentaries, and audio, none of which we currently support. All three of these resources are tremendously useful and help us provide a much richer experience for our users.

Editions are important because different communities often prefer different editions of a text. Ambuda should serve the needs of all of these communities impartially.

Once these systems are in place, we can rapidly add all of these resources to Ambuda.

An open format

By the end of Q4, Ambuda will support downloading texts as XML files and PDFs.

Ambuda is committed to open data and open access. But the longer we go without making our resources easy to download, the more difficult it is to add that functionality later. At a minimum, Ambuda will let users download the raw XML files that we use to create our reader. In addition, we will support simple PDF downloads so that these texts are easier to read offline.

Intelligent tools

By the end of Q4, Ambuda will have:

an online tool that parses arbitrary Sanskrit.
an interface for creating and correcting parse data for Sanskrit texts.

At our current size, Ambuda’s distinguishing feature is its one-click word analysis. To extend this feature to every text in our library, we will continue to invest in our tooling here and build a simple interface for parsing texts and correcting parse data.

Our team

In addition to direct work on our mission, we will also make the Ambuda team stronger and more resilient.

Community

By the end of Q4, Ambuda will have clear community guidelines that our contributors accept by consensus.

Our community is growing, and we need to broadly agree on how we should communicate with each other effectively. Our guidelines should be lightweight and welcoming while also setting clear rules that we can all accept and follow.

Legal

By the end of Q4, Ambuda will have a clear plan for being recognized as a non-profit organization.

Ambuda is a non-profit in spirit but not in law. Legal recognition has substantial benefits for the project:

We will receive substantial discounts on the technology we will need as we grow, including email and cloud services.
We will have a credible process for handling donations responsibly.
We will have a credible process for appointing a new project lead in case I am no longer able to serve in that capacity.

In short, the benefit is that we will become a much more credible group.

I am new to this process and don’t know how long it takes. So, my focus is on creating a strategy and timeline here.

Donations

By the end of Q4, Ambuda will accept international donations and have a much better user experience for requesting and receiving donations.

The easiest way to scale our proofing effort is to hire proofreaders directly. To do so effectively, we need to raise more money and make it easier for others to donate to Ambuda.

Closing thoughts

Our plans for this quarter focus heavily on internal concerns. But by making the bones of our project stronger, we will be well positioned for the future.

Our work is just beginning. Thank you for supporting Ambuda, and please contact us if you would like to help in any capacity.

Arun Prasad

1 October 2022

Appendix: Funds as of 2022-10-01

Our expenses so far have been low. But once we start scaling our proofing effort, our expenses will increase greatly. I’m planning to hold a 1 year reserve for website costs (around $250) and use the rest to hire proofreaders.

Income:

$2264.44 (PayPal donations)

Costs:

$9.16 (ambuda.org 1-year registration)
$75.80 (ambuda.org 5-year renewal)
$7.98 (ambuda.in 1-year registration)
$9.81 (DigitalOcean hosting, June 2022)
$12.00 (DigitalOcean hosting, July 2022)
$12.00 (DigitalOcean hosting, August 2022)
$12.00 (DigitalOcean hosting, September 2022)
$0.02 (AWS storage, August 2022)
$0.02 (AWS storage, September 2022)
$1.53 (Google Cloud API, September 2022)
$0.33 (Plausible analytics, June 2022)
$0.62 (Plausible analytics, July 2022)
$1.02 (Plausible analytics, August 2022)
$1.05 (Plausible analytics, September 2022)

Funds remaining: $2121.10
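For readers who want to check the arithmetic, here is a minimal Python sketch that totals the costs listed above and subtracts them from income. The variable names are illustrative only, and the use of Decimal is an assumption made here to avoid floating-point rounding; it is not a description of how Ambuda keeps its books.

```python
from decimal import Decimal

# All figures are copied from the income and cost lists above (USD).
income = Decimal("2264.44")  # PayPal donations

costs = [
    "9.16", "75.80", "7.98",             # domain registrations and renewal
    "9.81", "12.00", "12.00", "12.00",   # DigitalOcean hosting, June to September 2022
    "0.02", "0.02",                      # AWS storage, August and September 2022
    "1.53",                              # Google Cloud API, September 2022
    "0.33", "0.62", "1.02", "1.05",      # Plausible analytics, June to September 2022
]

# Decimal keeps exact cents, so the totals can be compared directly.
total_costs = sum(Decimal(c) for c in costs)
funds_remaining = income - total_costs

print(f"Total costs: ${total_costs}")
print(f"Funds remaining: ${funds_remaining}")
```

The listed costs sum to $143.34, which leaves the $2121.10 reported above.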

Notes:

Our Plausible plan is shared across multiple projects, including learnsanskrit.org. It is billed at $96.00 per year, or the equivalent of $8.00 per month, with a monthly quota of 100k views. Ambuda’s monthly page views are 4.1k (June), 7.7k (July), 12.7k (August), and 13.1k (September), which yield prorated costs of $0.33, $0.62, $1.02, and $1.05.
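The proration described in this note can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name prorated_cost is hypothetical, and rounding half up to the nearest cent is assumed here because the note does not say which rounding rule was used.

```python
from decimal import Decimal, ROUND_HALF_UP

# Plan details as stated in the note above.
MONTHLY_PRICE = Decimal("96.00") / 12   # $96.00 per year, i.e. $8.00 per month
MONTHLY_QUOTA = Decimal("100000")       # 100k page views per month

# Ambuda's monthly page views, as stated in the note above.
page_views = {
    "June": 4100,
    "July": 7700,
    "August": 12700,
    "September": 13100,
}


def prorated_cost(views: int) -> Decimal:
    """Ambuda's share of the monthly price, in proportion to quota used.

    Rounding half up to the cent is an assumption of this sketch.
    """
    share = Decimal(views) / MONTHLY_QUOTA
    return (MONTHLY_PRICE * share).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


monthly_costs = {month: prorated_cost(views) for month, views in page_views.items()}

for month, cost in monthly_costs.items():
    print(f"{month}: ${cost}")
```

Under these assumptions, the four months come to $0.33, $0.62, $1.02, and $1.05, matching the Plausible entries in the cost list.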

Help us make Ambuda available in your language!

3 years ago

As part of our mission to make the Sanskrit tradition radically accessible, we have started to increase our site’s support for other languages. We currently support both Sanskrit and English, and you can choose between them from our site’s main page.

Please help us make Ambuda available in your language. You don’t need to know any Sanskrit to help us with our work.

There are three ways you can help.

1. Translate our interface to your language

If you want to help us translate our English interface to your language, please join our translation project on Transifex.

The language on the Transifex site is a little confusing, so just to clarify: translator accounts are free forever, and you don’t need to pay anything to sign up.

2. Help us find Sanskrit dictionaries in your language

We also want to add more Sanskrit dictionaries to our site. Please let us know if you have access to other dictionaries we can digitize and support.

So far, we have resources for the following language pairs:

Sanskrit-Sanskrit (वाचस्पत्यम्, शब्दकल्पद्रुमः, various koshas)
Sanskrit-Hindi (Apte)
Sanskrit-Kannada (शब्दार्थकौस्तुभः)
Sanskrit-Telugu (PDF of the सर्वशब्दसम्बोधिनी)
Sanskrit-Marathi (hard copy of the गीर्वाणलघुकोश)
Sanskrit-Gujarati (PDF of the शब्दरत्नमहोदधिः)

As well as:

Sanskrit-English (various)
Sanskrit-German (various)
Sanskrit-French (various)

We are most interested in plain text files and PDF scans. If you have a book in hard copy, please consider using an app like vFlat Scan to digitize it.

3. Tell your friends

If you don’t have the time to help us directly, please help us find other volunteers by sharing this post with your friends and family. Every bit helps!

If you have any questions or comments, please contact us or join us on our Discord server on the #i18n channel. Thank you.