Interrogation

Description of interrogation in Iris: what it is and how it can be used.

How does it work?

Interrogation uses RAG (Retrieval-Augmented Generation) to answer queries. This method drastically reduces the number of hallucinations, since answers are grounded in verified sources (the documents) rather than in the model's memory.

The pipeline is as follows:

  1. Input Processing:

    • The system receives a text input, like a question or a prompt.

  2. Retrieval Phase:

    • The system searches the list of documents (a single document, a set of documents, or a list of strings) for relevant information.

  3. Relevance Determination:

    • The system evaluates the retrieved documents to determine their relevance to the input query.

    • Only the most pertinent paragraphs (called chunks) are selected for the next stage.

  4. Generation Phase:

    • The language model then uses the relevant chunks to generate a response, which is returned to the caller.
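The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Iris implementation: the word-overlap score, the paragraph-based chunking, and all function names are hypothetical (a production system would typically rank chunks by embedding similarity), and the final call to the language model is left out. The limit of two chunks follows the limitation described on this page.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))

def split_into_chunks(document):
    """Assumed chunking rule: one chunk per blank-line-separated paragraph."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]

def score(query, chunk):
    """Toy relevance score: number of words shared by the query and the chunk."""
    return len(tokenize(query) & tokenize(chunk))

def retrieve(query, documents, top_k=2):
    """Steps 2 and 3: gather all chunks, rank them, keep the top_k best."""
    chunks = [c for doc in documents for c in split_into_chunks(doc)]
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    return ranked[:top_k]

def build_prompt(query, chunks):
    """Step 4 (input side): the text a language model would receive."""
    context = "\n\n".join(chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

documents = [
    "The ground floor has a surface of 120 m2.\n\n"
    "The second floor has a surface of 95 m2.\n\n"
    "The roof is made of slate."
]
query = "What is the surface of the second floor?"  # step 1: input
selected = retrieve(query, documents)
prompt = build_prompt(query, selected)
print(prompt)
```

Only the two best-scoring paragraphs reach the prompt; the paragraph about the roof is dropped, which is why questions that need the whole document are a poor fit for this approach.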

What are the current limitations?

  • The main current limitation is that the whole document cannot be used as input: only the two most relevant chunks are used to answer the query. If the query requires the entire document, the answer may therefore be poor.

What will not work properly

  • Can you summarize the document? For this request, see Summarization.

  • What is the subject of the document? For this request, see Summarization.

What will work

  • What is the surface of the second floor?

  • What is the price of the project?

API

Interrogate a list of documents

The documents must already have been indexed.

import requests
import json

token = 'JWT ' + ''  # set your token here
# document_id_list selects the indexed document(s) to interrogate (here, ID 1)
url = "https://iris.egis-group.com/api/cgpt_structure/task_execute/?label_task=question_document_id&document_id_list=1"

# Each entry carries one question: a label and the question text
payload = json.dumps({
  "values_list": [
    {
      "label": "question_1",
      "val": "What is question 1?"
    },
    {
      "label": "question_2",
      "val": "What is question 2?"
    }
  ]
})
headers = {
  'Authorization': token,
  'Content-Type': 'application/json'
}

response = requests.post(url, headers=headers, data=payload)
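When the questions come from a list, the request body can be generated instead of written by hand. The sketch below is a hypothetical helper (`build_question_payload` is not part of the Iris API) and assumes that labels of the form `question_N`, as used in the example above, are acceptable to the service.

```python
import json

def build_question_payload(questions):
    """Serialize a list of question strings into a values_list JSON body."""
    values_list = [
        {"label": f"question_{i}", "val": question}  # label format is an assumption
        for i, question in enumerate(questions, start=1)
    ]
    return json.dumps({"values_list": values_list})

payload = build_question_payload([
    "What is the surface of the second floor?",
    "What is the price of the project?",
])
print(payload)
```

The resulting string can be passed as `data=payload` in the request shown above.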

Interrogate a list of strings

The string list does not need to have been indexed.

import requests
import json

token = 'JWT ' + ''  # set your token here
url = "https://iris.egis-group.com/api/cgpt_structure/task_execute/?label_task=string_simple_question"

# values_list holds the questions; string_list holds the texts they are asked about
payload = json.dumps({
  "values_list": [
    {
      "label": "question_1",
      "val": "What is question 1?"
    },
    {
      "label": "question_2",
      "val": "What is question 2?"
    }
  ],
  "string_list": [
    "Text 1 that both questions are about",
    "Text 2 that both questions are about"
  ]
})
headers = {
  'Authorization': token,
  'Content-Type': 'application/json'
}

response = requests.post(url, headers=headers, data=payload)

Interrogate a list of strings and return an Excel file

The answers can also be returned in an Excel file, each written directly into its target cell. In this case the string list has not been indexed beforehand, since it is passed as text rather than by ID.

import requests
import json

token = 'JWT ' + ''  # set your token here
url = "https://iris.egis-group.com/api/cgpt_structure/task_execute/?label_task=interrogate_string_to_excel"

# "column" and "row" give the cell in which each answer is written
payload = json.dumps({
  "values_list": [
    {
      "label": "question_1",
      "val": "What is question 1?",
      "column": 5,
      "row": 2
    },
    {
      "label": "question_2",
      "val": "What is question 2?",
      "column": 5,
      "row": 3
    }
  ],
  "string_list": [
    "Text 1 to interrogate",
    "Text 2 to interrogate"
  ]
})
headers = {
  'Authorization': token,
  'Content-Type': 'application/json'
}

response = requests.post(url, headers=headers, data=payload)
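For larger sheets it may be easier to describe the layout as a mapping from cells to questions. The helper below is a hypothetical sketch, not part of the Iris API: it assumes the same `label`, `val`, `column`, and `row` keys as the example above, and it makes no claim about whether the service counts rows and columns from 0 or from 1.

```python
import json

def build_excel_payload(questions_by_cell, string_list):
    """Build the JSON body from a {(row, column): question} mapping."""
    values_list = [
        {"label": f"question_{i}", "val": question, "column": column, "row": row}
        for i, ((row, column), question) in enumerate(questions_by_cell.items(), start=1)
    ]
    return json.dumps({"values_list": values_list, "string_list": string_list})

payload = build_excel_payload(
    {
        (2, 5): "What is the surface of the second floor?",
        (3, 5): "What is the price of the project?",
    },
    ["Text 1 to interrogate", "Text 2 to interrogate"],
)
print(payload)
```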

Interrogate a list of documents and return an Excel file

The answers can also be returned in an Excel file, each written directly into its target cell.

import requests
import json

token = 'JWT ' + ''  # set your token here
url = "https://iris.egis-group.com/api/cgpt_structure/task_execute/?label_task=interrogate_document_to_excel&document_id_list=1"

# Here values_list is sent as a form field whose value is a JSON string
payload = {
  'values_list': json.dumps([
    {"label": "question_1", "val": "What is the cost of integrating ANZ?", "column": 5, "row": 2},
    {"label": "question_2", "val": "What are the benefits of integrating ANZ?", "column": 5, "row": 3}
  ])
}
files = []  # no file is uploaded with this request
headers = {
  'Authorization': token
}

response = requests.post(url, headers=headers, data=payload, files=files)
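The response format is not documented on this page. Assuming the Excel tasks return the workbook itself as the response body, it could be written to disk as sketched below. `save_workbook` is a hypothetical helper; the only check it performs relies on the fact that .xlsx files are ZIP archives and therefore start with the bytes `PK\x03\x04`.

```python
from pathlib import Path

XLSX_MAGIC = b"PK\x03\x04"  # ZIP signature; .xlsx files are ZIP archives

def save_workbook(content: bytes, path: str) -> Path:
    """Write the bytes to path, refusing bodies that are clearly not .xlsx."""
    if not content.startswith(XLSX_MAGIC):
        raise ValueError("response body does not look like an .xlsx file")
    target = Path(path)
    target.write_bytes(content)
    return target

# Usage, under the assumption stated above:
# save_workbook(response.content, "answers.xlsx")
```

If the service returns an error message or JSON instead, the helper raises `ValueError` rather than writing a file that cannot be opened.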
