Module: css_inference_workflow
CSS Inference Workflow
The CSS (Closed-Source Software) Inference Workflow is a class for running inference on various closed-source, text-based models. Currently, the following APIs are supported:
- OpenAI completions
- OpenAI embeddings
- Perplexity AI completions
- GooseAI completions
Constructor Arguments
api_keys
: API keys for the closed-source model
retry_params
: Retry parameters for the closed-source model
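For example, retry behavior for transient provider errors can be configured at construction time. Below is a minimal sketch, assuming a `RetryParams` helper; the import path and field names are assumptions, so verify them against your installed version:

```python
import os

from dotenv import load_dotenv

from infernet_ml.utils.css_mux import ApiKeys, Provider
# NOTE: the RetryParams import path and field names are assumptions; verify
# them against your installed infernet-ml version.
from infernet_ml.utils.retry import RetryParams
from infernet_ml.workflows.inference.css_inference_workflow import CSSInferenceWorkflow

load_dotenv()

api_keys: ApiKeys = {Provider.OPENAI: os.getenv("OPENAI_API_KEY")}

# Retry transient provider errors up to 3 times, doubling the delay each time
retry_params = RetryParams(tries=3, delay=1, backoff=2)

workflow = CSSInferenceWorkflow(api_keys, retry_params=retry_params)
workflow.setup()
```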
Additional Installations
Since this workflow uses some additional libraries, you'll need to install `infernet-ml[css_inference]` (e.g. `pip install "infernet-ml[css_inference]"`).
Alternatively, you can install those packages directly; the optional `[css_inference]` dependency group is provided for your convenience.
Completions Example
The following is an example of how to use the CSS Inference Workflow to make a request to OpenAI's completions API.
```python
import os

from dotenv import load_dotenv

from infernet_ml.utils.css_mux import (
    ApiKeys,
    ConvoMessage,
    CSSCompletionParams,
    CSSRequest,
    Provider,
)
from infernet_ml.workflows.inference.css_inference_workflow import CSSInferenceWorkflow

load_dotenv()

api_keys: ApiKeys = {
    Provider.OPENAI: os.getenv("OPENAI_API_KEY"),
}


def main():
    # Instantiate the workflow
    workflow: CSSInferenceWorkflow = CSSInferenceWorkflow(api_keys)

    # Set up the workflow
    workflow.setup()

    # Define the parameters for the completions API
    params: CSSCompletionParams = CSSCompletionParams(
        messages=[ConvoMessage(role="user", content="hi how are you")]
    )

    # Define the request
    req: CSSRequest = CSSRequest(
        provider=Provider.OPENAI,
        endpoint="completions",
        model="gpt-3.5-turbo-16k",
        params=params,
    )

    # Run the model
    response = workflow.inference(req)
    print(response)


if __name__ == "__main__":
    main()
```
Running the script above makes a request to OpenAI's completions API and prints the response.
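The exact text varies between runs, but per `do_postprocessing` (documented below), the result is by default a dictionary with a single `output` key. An illustrative (not verbatim) response:

```python
{'output': "Hello! I'm doing well, thank you. How can I assist you today?"}
```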
Streaming Example
The following is an example of how to use the CSS Inference Workflow to stream results from OpenAI's completions API.
```python
import os

from dotenv import load_dotenv

from infernet_ml.utils.css_mux import (
    ApiKeys,
    ConvoMessage,
    CSSCompletionParams,
    CSSRequest,
    Provider,
)
from infernet_ml.workflows.inference.css_inference_workflow import CSSInferenceWorkflow

load_dotenv()

api_keys: ApiKeys = {
    Provider.OPENAI: os.getenv("OPENAI_API_KEY"),
}


def main():
    # Instantiate the workflow
    workflow: CSSInferenceWorkflow = CSSInferenceWorkflow(api_keys)

    # Set up the workflow
    workflow.setup()

    # Define the parameters for the completions API
    params: CSSCompletionParams = CSSCompletionParams(
        messages=[ConvoMessage(role="user", content="hi how are you")]
    )

    # Define the request
    req: CSSRequest = CSSRequest(
        provider=Provider.OPENAI,
        endpoint="completions",
        model="gpt-3.5-turbo-16k",
        params=params,
    )

    # Run the model and stream the response
    for response in workflow.stream(req):
        print(response)


if __name__ == "__main__":
    main()
```
Running this script prints the response incrementally, chunk by chunk, as it is generated.
Other Inputs
To explore other inputs, check out the inference() method's arguments.
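For instance, the workflow also supports OpenAI's embeddings API (listed above). Below is a minimal sketch, assuming a `CSSEmbeddingParams` class analogous to `CSSCompletionParams`; that class name and the model name are assumptions, so verify them against your installed version:

```python
import os

from dotenv import load_dotenv

# NOTE: CSSEmbeddingParams is assumed by analogy with CSSCompletionParams;
# verify the name against your installed infernet-ml version.
from infernet_ml.utils.css_mux import ApiKeys, CSSEmbeddingParams, CSSRequest, Provider
from infernet_ml.workflows.inference.css_inference_workflow import CSSInferenceWorkflow

load_dotenv()

api_keys: ApiKeys = {Provider.OPENAI: os.getenv("OPENAI_API_KEY")}

workflow = CSSInferenceWorkflow(api_keys)
workflow.setup()

# Request an embedding vector instead of a text completion
req = CSSRequest(
    provider=Provider.OPENAI,
    endpoint="embeddings",
    model="text-embedding-3-small",  # any OpenAI embeddings model
    params=CSSEmbeddingParams(input="hi how are you"),
)

# Per do_run_model's return type, embeddings come back as a list of numbers
embedding = workflow.inference(req)
print(len(embedding))
```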
CSSInferenceWorkflow
Bases: BaseInferenceWorkflow
Base workflow object for closed-source LLM inference models.
Source code in src/infernet_ml/workflows/inference/css_inference_workflow.py
__init__(api_keys, retry_params=None)
Constructor. Any named arguments are passed to the closed-source LLM during inference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `api_keys` | `ApiKeys` | API keys for the supported closed-source providers | *required* |
| `retry_params` | `Optional[RetryParams]` | retry parameters for requests to the provider | `None` |
do_generate_proof()
do_postprocessing(input_data, gen_text)
Implement any postprocessing here. For example, you may need to return additional data. By default, returns a dictionary with a single output key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_data` | `dict[str, Any]` | original input data from client | *required* |
| `gen_text` | `str` | result from the closed-source LLM model | *required* |
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Union[Any, dict[str, Any]]` | transformation of `gen_text` |
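Because `do_postprocessing` is an overridable hook, a subclass can reshape the raw completion before it is returned. A minimal sketch; the class name and returned keys are illustrative, not part of the library:

```python
from typing import Any, Union

from infernet_ml.workflows.inference.css_inference_workflow import CSSInferenceWorkflow


class WrappedCSSInferenceWorkflow(CSSInferenceWorkflow):
    """Illustrative subclass that wraps the generated text with metadata."""

    def do_postprocessing(
        self, input_data: dict[str, Any], gen_text: str
    ) -> Union[Any, dict[str, Any]]:
        # Return the generated text alongside extra data; these keys are
        # illustrative, not part of the library's contract.
        return {"output": gen_text, "length": len(gen_text)}
```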
do_preprocessing(input_data)
Validate input data and return a dictionary with the provider and endpoint.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_data` | `CSSRequest` | input data from client | *required* |
Returns:
| Name | Type | Description |
|---|---|---|
| `CSSRequest` | `CSSRequest` | validated input data |
do_run_model(preprocessed_data)
Inference implementation. Generally, you should not need to change this implementation directly, as the code already implements calling a closed-source LLM server.
Instead, you can perform any preprocessing or postprocessing in the relevant abstract methods.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `preprocessed_data` | `CSSRequest` | user input | *required* |
Returns:
| Type | Description |
|---|---|
| `Union[str, list[Union[float, int]]]` | result of inference |
do_setup()
do_stream(_input)
Stream results from the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `_input` | `CSSRequest` | input data from client | *required* |
Returns:
| Type | Description |
|---|---|
| `Iterator[str]` | stream of results |
inference(input_data, log_preprocessed_data=True)
Perform inference on the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_data` | `CSSRequest` | input data from client | *required* |
| `log_preprocessed_data` | `bool` | whether to log the preprocessed input data | `True` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | result of inference |
stream(input_data)
Stream results from the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_data` | `CSSRequest` | input data from client | *required* |
Returns:
| Type | Description |
|---|---|
| `Iterator[str]` | stream of results |