KameronB committed on
Commit 1e0ea4b
1 Parent(s): 1b3f4c5

Update README.md

Files changed (1)
  1. README.md +114 -0
README.md CHANGED
@@ -1,3 +1,117 @@
  ---
  license: mit
+ language:
+ - en
+ tags:
+ - text-2-text
+ - natural-language
+ - nlp
+ - classification
+ - call center
+ - IT
+ - summarization
+ - text-generation
  ---
+ # SITCC-T5-Classifier Model Card
+
+ ## Model Description
+ The SITCC-T5-Classifier model is a fine-tuned version of google/flan-t5-base. It is trained to read an IT ticket description and extract two fields: the request or issue, and the software or system the ticket concerns. Fine-tuning used 5,716 synthetic input/output pairs generated with OpenAI GPT-4 Turbo.
+
+ ## Model Details
+ - Base Model: google/flan-t5-base
+ - Fine-tuning Data: 5,716 synthetic IT ticket description pairs generated by OpenAI GPT-4 Turbo
+
+ ## Intended Use
+ The SITCC-T5-Classifier model is intended for IT ticket classification and information extraction. Given a ticket description, it identifies the request/issue and the software/system mentioned.
+
+ ## Limitations and Known Issues
+ - Performance may vary with the quality and diversity of the input IT ticket descriptions.
+ - The model may struggle with complex or ambiguous ticket descriptions.
+ - The model may not perform well on ticket descriptions that differ significantly from the synthetic training data.
+
+ ## Example Usage
+ This example runs on the CPU.
+ ```python
+ import re
+ from time import perf_counter
+
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+
+ class SITCC_T5_Classifier:
+     """
+     A class for classifying text using the SITCC T5 model.
+
+     Attributes:
+         tokenizer (T5Tokenizer): The tokenizer for the T5 model.
+         model (T5ForConditionalGeneration): The T5 model for classification.
+     """
+
+     def __init__(self):
+         # Load the fine-tuned tokenizer and model; from_pretrained places the model on the CPU by default
+         self.tokenizer = T5Tokenizer.from_pretrained("KameronB/sitcc-t5-classifier")
+         self.model = T5ForConditionalGeneration.from_pretrained("KameronB/sitcc-t5-classifier")
+
+     def process_response(self, response: str) -> dict:
+         """
+         Process the response and extract the software/system and issue/request.
+
+         Args:
+             response (str): The response text.
+
+         Returns:
+             dict: A dictionary containing the software/system and issue/request.
+                 Both values are None if the response does not match the expected format.
+         """
+         # The end-of-sequence token is optional in case generation stopped at max_new_tokens
+         matches = re.search(r'Software/System: (.*) Issue/Request: (.*?)(?:</s>|$)', response, re.DOTALL)
+         if matches is None:
+             return {"Software/System": None, "Issue/Request": None}
+         return {
+             "Software/System": matches.group(1),
+             "Issue/Request": matches.group(2)
+         }
+
+     def classify_entry(self, entry: str, max_new_tokens: int = 60) -> dict:
+         """
+         Classify the input text and return the classification results.
+
+         Args:
+             entry (str): The input text to be classified.
+             max_new_tokens (int): The maximum number of tokens to generate.
+
+         Returns:
+             dict: The classification results.
+         """
+         # Tokenize the input text
+         input_ids = self.tokenizer(entry, return_tensors="pt").input_ids.to("cpu")
+
+         # Generate the output text
+         outputs = self.model.generate(input_ids, max_new_tokens=max_new_tokens)
+
+         # Decode the output text (special tokens included) and parse it
+         return self.process_response(self.tokenizer.decode(outputs[0]))
+
+
+ # Create the SITCC T5 Classifier wrapper class for the fine-tuned T5 model
+ sitcc_t5 = SITCC_T5_Classifier()
+
+ # Define the input texts
+ input_text = [
+     "The customer is getting the following error when using rSATS:\nERROR: 'Failed to connect'. \nI have tried restarting the application and the computer, but the issue persists. \nEscalating to Team",
+     "The customer is experiencing issues with their network connectivity, which is causing slow internet speeds and frequent disconnections.",
+     "The customer is unable to access the shared drive on the network. They receive an error message stating 'Network path not found'. \nEscalating to Network Team",
+     "The customer is unable to print from their computer. They have checked the printer connections and restarted the printer, but the issue persists. \nEscalating to Printer Support Team",
+ ]
+
+ # Measure the total time taken to classify all inputs
+ start = perf_counter()
+ for text in input_text:
+     # Classify the input text
+     print(sitcc_t5.classify_entry(text))
+ end = perf_counter()
+ print(f"Time taken: {end - start} seconds")
+ ```
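
The parsing step in `process_response` can be checked without downloading the model by running a regular expression over hand-written strings. The block below is a minimal sketch, not the author's implementation. It assumes the decoded output has the layout `<pad> Software/System: ... Issue/Request: ...</s>`, which is inferred from the pattern in the example above. The names `PATTERN`, `parse_response`, and `samples` are hypothetical, and the sample strings are invented for illustration rather than taken from real model output.

```python
import re

# Assumed layout, inferred from the pattern used in the README example:
#   "<pad> Software/System: <name> Issue/Request: <text></s>"
# The end-of-sequence token is treated as optional here.
PATTERN = re.compile(
    r"Software/System:\s*(.*?)\s*Issue/Request:\s*(.*?)\s*(?:</s>|$)",
    re.DOTALL,
)


def parse_response(response: str) -> dict:
    """Return the two extracted fields, or None values if the layout is not found."""
    match = PATTERN.search(response)
    if match is None:
        return {"Software/System": None, "Issue/Request": None}
    return {"Software/System": match.group(1), "Issue/Request": match.group(2)}


# Hand-written strings for illustration only; these are not real model outputs.
samples = [
    "<pad> Software/System: rSATS Issue/Request: Failed to connect error</s>",
    "<pad> Software/System: Printer Issue/Request: Unable to print",  # no </s>
    "<pad> unrelated text</s>",  # does not follow the assumed layout
]

results = [parse_response(s) for s in samples]
for result in results:
    print(result)
```

Returning `None` values instead of raising keeps a batch loop running when a single ticket produces output in an unexpected shape. Callers that prefer to fail fast could raise a `ValueError` at the same point.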