Enhancing-Customer-Complaint-Resolution-Using-TF-IDF-Information-Retrieval-System-and-Graph-Mining

1. Introduction

This project applies text mining and similarity analysis to customer complaints written in Arabic and English. It uses natural language processing (NLP) techniques to preprocess the complaint text and to measure how similar a user query is to existing complaint descriptions.

2. Purpose

The primary purpose of this project is to let users (employees) find existing complaints that are similar to an input query. Complaints and queries are represented as TF-IDF vectors and compared by cosine similarity, so the most relevant complaints are retrieved first and pertinent information is easier to reach.
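
To make the retrieval idea concrete, here is a minimal sketch of TF-IDF ranking with cosine similarity using scikit-learn. The sample complaints and the helper name `top_similar` are hypothetical (the real dataset is not public), and the notebooks may structure this differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in complaints; the real dataset is not public.
complaints = [
    "internet connection is very slow at night",
    "billing error charged twice for the same invoice",
    "no network coverage in my area",
    "slow internet speed and frequent disconnections",
]

def top_similar(query, documents, k=5):
    """Return up to k (index, score) pairs ranked by cosine similarity to the query."""
    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(documents)  # one TF-IDF row per complaint
    query_vec = vectorizer.transform([query])         # reuse the fitted vocabulary
    scores = cosine_similarity(query_vec, doc_matrix).ravel()
    ranked = scores.argsort()[::-1][:k]               # highest score first
    return [(int(i), float(scores[i])) for i in ranked]

results = top_similar("slow internet", complaints, k=2)
for idx, score in results:
    print(f"{score:.3f}  {complaints[idx]}")
```

The query is transformed with the vectorizer fitted on the complaints, so both share one vocabulary and the cosine scores are comparable.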

3. Data Source

The dataset used for this project is a proprietary, non-public dataset from a telecommunications company in Amman, Jordan.

4. Packages Used

The following Python packages are utilized in this project:

pandas: For data manipulation and analysis.
numpy: For numerical computations.
nltk: For natural language processing tasks including tokenization, stemming, and stopword removal.
tashaphyne: For Arabic text processing and stemming.
langid: For language identification.
sklearn (scikit-learn): For TF-IDF vectorization and cosine similarity calculations.
networkx: For creating and visualizing graphs.
matplotlib: For plotting graphs.
tkinter: For creating the user interface.

5. Mechanism

The project follows these key steps:

Text Preprocessing: The raw text data is cleaned and processed. This includes tokenization, stopword removal, stemming, and handling Arabic diacritics.
TF-IDF Vectorization: The preprocessed text is transformed into a TF-IDF matrix, which weights each word by how often it appears in a document and how rare it is across the entire dataset.
Cosine Similarity Calculation: Similarity scores are computed between the user query and the existing complaint descriptions to identify the most relevant documents.
Co-occurrence Analysis: The project also includes an analysis of word co-occurrences in the top similar documents to identify relationships between terms.
Graph Visualization: Results are visualized through directed and weighted graphs, showcasing the relationships and similarities among complaints.
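
The Text Preprocessing step above can be sketched roughly as follows. This is a simplified illustration that uses only the Python standard library: it omits stemming and language identification (which the project handles with nltk, tashaphyne, and langid), and the stopword set and the function name `preprocess` are hypothetical.

```python
import re

# Arabic diacritics (tashkeel, U+064B..U+0652), superscript alef, and tatweel.
ARABIC_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")
# Tiny illustrative stopword set (hypothetical); a real list would be much larger.
STOPWORDS = {"the", "is", "a", "في", "من", "على"}

def preprocess(text):
    """Lowercase, strip Arabic diacritics, tokenize on word characters, drop stopwords."""
    text = ARABIC_DIACRITICS.sub("", text.lower())
    tokens = re.findall(r"\w+", text)  # \w is Unicode-aware, so Arabic letters match
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("The internet is SLOW"))
print(preprocess("الإنْتَرنِت بَطيء في المنزل"))
```

Diacritics are removed before tokenizing because combining marks are not word characters and would otherwise split an Arabic word into fragments.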

6. Results

The output of the project includes:

The top 5 most similar documents based on the user query and their cosine similarity scores.
Directed and weighted graph visualizations that illustrate the relationships between terms and the structure of similar complaints.
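
As an illustration of the directed, weighted graphs, the sketch below builds a word co-occurrence graph with networkx. It assumes that co-occurrence means adjacent word pairs within a complaint, with edge direction following word order; the source does not state the exact definition used, and the token lists and the function name `cooccurrence_graph` are hypothetical.

```python
from collections import Counter
import networkx as nx

# Hypothetical, already-preprocessed token lists for the top similar complaints.
top_docs = [
    ["slow", "internet", "night"],
    ["slow", "internet", "disconnect"],
    ["internet", "disconnect", "router"],
]

def cooccurrence_graph(token_lists):
    """Build a directed graph whose edge weights count adjacent word pairs."""
    pair_counts = Counter()
    for tokens in token_lists:
        # Count each (word, next word) pair; direction follows word order.
        pair_counts.update(zip(tokens, tokens[1:]))
    graph = nx.DiGraph()
    graph.add_weighted_edges_from((a, b, w) for (a, b), w in pair_counts.items())
    return graph

graph = cooccurrence_graph(top_docs)
for a, b, w in sorted(graph.edges(data="weight")):
    print(f"{a} -> {b}  weight={w}")
```

A graph built this way can be drawn with `nx.draw_networkx` and matplotlib, for example by scaling edge width with the `weight` attribute.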
