The Ministry of Defence (MoD) started 2025 by deeming it the ‘Year of Reforms’. This year, it has pledged to focus on emerging technologies, especially robotics, machine learning, and artificial intelligence (AI). The theme, of course, is an organic continuation of its 2024 theme, the ‘Year of Technology Absorption, Empowering the Soldier’. The usual perception is that the soldier needs to be empowered only on the battlefield and during combat, but that is not entirely true. Assisting the soldier in diverse non-battlefield use cases (internal administration, allocation of business rules, logistics, command- and brigade-level procurement, personnel re-education and training, wargaming, disaster search and rescue, military doctrine, and technology ethics) goes a long way in making the military more efficient. AI absorption has already begun within the Indian Armed Forces for non-battlefield use cases. But the Armed Forces cannot be merely users of AI; their ability to cultivate and enhance national AI capabilities must be exploited.
Since its establishment in 2018, the MoD’s Defence Artificial Intelligence Projects Agency (DAIPA), along with the Indian Armed Forces, has created a pipeline of bespoke AI tools, such as chatbots, audiobots, and videobots, used as non-battlefield virtual assistants for soldiers, on par with militaries worldwide. In April 2024, the Indian Army Southern Command, as part of its technology absorption drive, built an AI chatbot called SAMADHAN that assists commissioned officers with queries on public procurement and related policies. Another Indian Army-led chatbot, SAMBANDH, is purpose-built to assist war veterans and war widows by addressing their queries and grievances through one-on-one communication and disseminating relevant information to them. In a similar vein, in 2023, the US-based information technology services company World Wide Technology Inc. introduced ‘SergeantAI’, an AI virtual assistant that helps soldiers, particularly non-commissioned officers, adhere to US Army regulations. China’s Joint Operations College, National Defence University has created a ‘Virtual AI Commander’, an AI avatar of a real-world Chinese commander, to assist soldiers at lower ranks with table-top simulations and virtual war games when the real-world commander is unavailable. The Chinese People’s Liberation Army (PLA) Academy of Military Sciences has also developed an AI chatbot called ChatBIT to support soldiers and intelligence officials in military intelligence operations and enhance their decision-making ability.
After the assimilation of AI virtual assistants in phase 1 of AI adoption, phase 2 involves the construction of bespoke defence large language models (LLMs). In November 2024, the US AI company Scale AI announced the launch of Defense Llama, an LLM built on Meta’s Llama 3 and customised to serve US national security missions, particularly carrying out intelligence operations, preparing for counter operations, and comprehending the vulnerabilities of adversaries. Defense Llama is said to have been trained on an enormous dataset, including military doctrines, US Department of Defense (DoD) policies and guidelines, and its Ethical Principles for Artificial Intelligence. The end-users of Defense Llama are the operators of command and control platforms, intelligence agencies, and decision-support systems. The Chinese PLA, now armed with the confidence of the indigenous DeepSeek V3 LLM’s success, may not want to depend on Meta or any other American foundational models for its military applications. Mandarin-language LLMs built by Chinese AI companies, such as Yi, Qwen, Baichuan, and XVERSE, are already performing well in Simplified and Traditional Chinese, Chinese ethnic minority languages like Kazakh, Jinghpo, and Lhasa Tibetan, and a few other East Asian languages. The PLA could soon use its Sinic language superiority to create Sinic language defence LLMs for itself. Such LLMs could be used for an enormous diversity of non-battlefield defence applications. Taking cues from this, the Indian MoD should consider being the benefactor of bespoke multilingual Indic-language defence LLMs.
Any AI language model is usually based on parameters such as ontologies, grammar, lexicons, and corpora, derived from databases of commonsensical facts written in a given language. The scale and size of these databases determine the model’s parameters. As of January 2025, state-of-the-art LLMs have as many as 1.56 trillion parameters, based, of course, on English-language databases. India, however, is well placed to build world-leading Indic language LLMs. To realise that advantage, it would be important to give such Indic LLMs enormous parameters drawn from the rich textual and audiovisual data in the possession of public sector entities. The MoD, being the largest employer in the country, could have its own Indic language defence LLM initiative, given the enormous Indic and English language repositories it holds.
Although there has been some unwarranted scepticism about India pursuing indigenously built LLMs, there is considerable potential for Indian AI companies to dominate the niche of Indic languages. In the context of this article, Indic languages include Hindi and the 21 other official languages of the country. Sarvam 2B, an open-source Indic language small language model (SLM) from the Indian startup Sarvam AI, although based on a modified Llama architecture, is performing much better in Indic languages than American foundational SLMs. So, if the MoD chooses to be the benefactor of Indic language LLMs (as closed-source models, given data sensitivity and network security) and offers its massive database to set up parameters, it would not only end up building a closed-source defence LLM for its intramural usage, but also create a massive use case for wider adoption of Indic languages across other sectors. But how should the MoD do it?
The Indian military’s first major steps in emerging technology upskilling and capacity building, including those for AI, were announced in 2025, when the Indian Army stated its ongoing plans to induct emerging technology domain specialists mid-2025 onwards through the Army Education Corps, which was recently renamed the Army Knowledge & Enablers Corps. Domain specialists with postgraduate degrees will be inducted into the commissioned officer ranks, while experts with graduate degrees will be inducted at the junior commissioned officers’ ranks.
Within the MoD, it is the junior commissioned and non-commissioned ranks, a sizable portion of the MoD’s non-military personnel, Agniveers, the Territorial Army, and the National Cadet Corps that generate and consume voluminous official content and data in Indic languages, particularly Hindi. This data consumption and generation aligns with the Official Languages Act, 1963, and it is this voluminous content, gathered over many decades and across all MoD and other Indian governmental institutions, that could be digitised and used to set parameters for closed-source Indic language defence LLMs. Of course, the English language data repository could be translated the way STEM academic textbooks are being translated into Hindi and other Indic languages under the AI Project Udaan carried out by IIT Bombay. This will further add to the enormity of parameters, a necessity for a richly fed LLM.
The Army Knowledge & Enablers Corps could create an ecosystem in partnership with DAIPA, DRDO, private AI companies, and civilian government laboratories. If our soldiers are to fight time-space-force-information wars, and if they are to be the human-in-the-loop of sophisticated systems, their acquaintance with the operations of AI systems is crucial. The AI systems they work on must operate in the language they think in and speak. Furthermore, an Indic language defence LLM could also benefit ‘Project Udbhav’, the Indian Army’s programme to synthesise ancient scripted wisdom on warfare, statecraft, historical missions, and ethics with modern-day military operations. Bringing such Sanskrit and other Indic language texts into the training and upskilling curriculum of the Army Training Command and the College of Defence Management will have a tremendous impact on empowering the soldier. An Indic language LLM will be a milestone towards the Indian government’s extant commitment to promoting Indic languages.
Jui Marathe is a Research Intern at the Observer Research Foundation
Chaitanya Giri is a Fellow with the Centre for Security, Strategy and Technology at the Observer Research Foundation
The views expressed above belong to the author(s).