@vectara
Vectara
2 years
1/12. @softwaredoug wrote a blog about #ChatGPT as a natural language programming paradigm. He posed some interesting questions about building #search systems with #LLMs. We interviewed our CTO @amin3141 to get his take on these.
2/12. Question 1: Imagine scaling up this system, but I can index millions of documents and it can run queries in sub 100ms time. Is that even possible?
3/12. Answer: There’s lots of interesting research that focuses on training #neural networks to directly serve queries on a corpus of documents by inputting query strings and outputting document ids. While it’s very costly, the gains in performance are also huge.
4/12. With the massive, ongoing investments into #ML hardware innovation, I expect such approaches to become commercially feasible within the next 2-3 years.
5/12. A more tractable approach is leveraging #LLMs to produce document #embeddings that capture #semantic information. With #embeddings, you can achieve a natural language query response latency of sub 100ms.
6/12. #Embeddings are high-dimensional #vectors that allow the interaction between a query and document to be reduced to a linear function, thereby allowing scaling to millions, even billions, of documents in a knowledge base.
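That linear function is just a dot product: embed each document once at index time, then score a query against every document with a single vector operation. A minimal sketch in pure Python (the 3-dimensional vectors here are hypothetical stand-ins for real #LLM embeddings, which have hundreds or thousands of dimensions):

```python
# Toy sketch: ranking documents by dot-product similarity of embeddings.
# The tiny 3-d vectors are hypothetical stand-ins for real LLM embeddings.

def dot(q, d):
    # The query-document interaction reduced to a linear function.
    return sum(qi * di for qi, di in zip(q, d))

# Document embeddings are computed once, at indexing time.
doc_embeddings = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.2, 0.8, 0.1],
    "doc3": [0.0, 0.3, 0.9],
}

# At query time, only the query needs to be embedded.
query_embedding = [0.8, 0.2, 0.0]

# Score every document in one pass. At scale this becomes a single
# matrix-vector product, typically served from an approximate
# nearest-neighbor index to stay within a sub-100ms latency budget.
ranked = sorted(doc_embeddings,
                key=lambda d: dot(query_embedding, doc_embeddings[d]),
                reverse=True)
print(ranked[0])
```

Because scoring is linear, adding documents never changes how a query is processed; the index just grows, which is what makes scaling to millions or billions of documents tractable.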
7/12. Question 2: Can natural language programming systems be compiled into optimized machine code to achieve high speed inference at scale?
8/12. Answer: Compiling to optimized machine code is already happening: e.g., for #PyTorch, the compiled, optimized representation is TorchScript. But it goes much further than that: #neural networks today run on purpose-built hardware like the TPU, Inferentia, Tensor Chip, NVIDIA A100/H100, and IPU.
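As a rough illustration of that compilation step, here is a minimal sketch of scripting a toy scoring module with `torch.jit.script` (the `DotScorer` module is hypothetical, chosen only to echo the embedding discussion above):

```python
import torch

class DotScorer(torch.nn.Module):
    """Toy module: scores a query vector against a batch of document vectors."""

    def forward(self, query: torch.Tensor, docs: torch.Tensor) -> torch.Tensor:
        # One matrix-vector product ranks every document at once.
        return docs @ query

# torch.jit.script compiles the Python module into TorchScript, a
# statically analyzable IR that can be optimized and executed without
# the Python interpreter (e.g., from C++ or on specialized serving stacks).
scorer = torch.jit.script(DotScorer())

scores = scorer(torch.tensor([1.0, 0.0]),
                torch.tensor([[1.0, 0.0], [0.0, 1.0]]))
```

The scripted module behaves like the original, but its graph is now available to downstream compilers and runtimes, which is the bridge from "natural language programming" prototypes to high-speed inference at scale.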
9/12. The biggest problem with #generative #LLMs isn't their speed, but a problem known in the research literature as hallucination. They can unpredictably fabricate information that sounds plausible, but is, in reality, false & unsupported.
10/12. While research into controlling or eliminating hallucination is very active, and I have no doubt that it will be solved at some point, it remains a serious impediment to realizing the potential of #generative #LLMs.
11/12. When hallucination is solved, #generative #neural systems are going to reshape the business world. They will be powered on the backend by extractive #neural platforms like Vectara, and will fully automate lots of knowledge-based jobs in every industry.
12/12. Unlike #generative systems, extractive #search platforms put customers in full control of the provenance of #search results, which are always extracted, verbatim, from material explicitly indexed in the system.
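The extractive contract can be sketched in a few lines: every result is a verbatim span from an indexed document, returned together with its source. The corpus below is invented, and the keyword-overlap score is a deliberately crude stand-in for the neural relevance scoring a platform like Vectara actually uses; only the contract itself is the point.

```python
# Toy sketch of the extractive contract: every result is a verbatim
# sentence from an indexed document, paired with its provenance.
# The keyword-overlap score is a hypothetical stand-in for neural scoring.

corpus = {
    "handbook.txt": "Employees accrue ten vacation days per year. Unused days roll over.",
    "faq.txt": "Vacation requests require manager approval.",
}

def extractive_search(query, corpus):
    q_terms = set(query.lower().split())
    results = []
    for doc_id, text in corpus.items():
        for sentence in text.split(". "):
            overlap = len(q_terms & set(sentence.lower().rstrip(".").split()))
            if overlap:
                # The sentence is returned verbatim, never paraphrased,
                # so the customer can always check it against the source.
                results.append((overlap, sentence, doc_id))
    return [(s, d) for _, s, d in sorted(results, reverse=True)]

hits = extractive_search("vacation days", corpus)
```

A generative system might answer the same query fluently but with no guarantee the answer appears anywhere in the corpus; here, by construction, every hit is traceable to the exact document it came from.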