discourser

A Streamlit application for analyzing the influence of a core corpus of texts on a second, target corpus.

Python 3.10

Installation

Clone this repository
Create a virtual environment: python -m venv venv
Activate the virtual environment: source venv/bin/activate (Linux/Mac) or venv\Scripts\activate (Windows)
Install dependencies: pip install -r requirements.txt

Alternatively, you can create a conda environment instead of a venv:

conda create -n discourser python=3.10
conda activate discourser

Running the Application

streamlit run app.py

Usage

Project Setup: Create or load a project
Core Corpus: Upload CSV metadata + text files for influential texts
Target Corpus: Upload corpus to analyze for influence
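Analysis: Generate embeddings for both corpora, then run the similarity, custom vector projection, and topic modeling analyses
Results: Review the most influential core texts, the 2D/3D vector spaces, and the topic-target analysis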

Note: 'Save project' only saves your core and target corpora. After loading a project you still need to initialize the embeddings and rerun the later steps.

Data Format

CSV Metadata

Required columns: filename, title, author, date, source
Optional columns: document_type
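
As a rough illustration of the column requirements above, the sketch below checks the header of a metadata CSV using only the Python standard library. The helper name `check_metadata_columns` and the sample row are hypothetical and not part of discourser; the app may validate uploads differently.

```python
import csv
import io

# Column names taken from the lists above.
REQUIRED_COLUMNS = ["filename", "title", "author", "date", "source"]
OPTIONAL_COLUMNS = ["document_type"]


def check_metadata_columns(csv_text):
    """Return (missing, unexpected) column names for a metadata CSV.

    Hypothetical helper for illustration; not part of discourser.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    known = set(REQUIRED_COLUMNS) | set(OPTIONAL_COLUMNS)
    unexpected = [c for c in header if c not in known]
    return missing, unexpected


# Made-up sample data, for illustration only.
sample = (
    "filename,title,author,date,source,document_type\n"
    "essay_01.txt,An Example Essay,A. Author,1901,Example Archive,essay\n"
)
missing, unexpected = check_metadata_columns(sample)
print("missing:", missing, "unexpected:", unexpected)
```

Running a check like this before uploading can catch a misspelled or missing column early.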

Text Files

UTF-8 encoded .txt files
Paragraphs separated by double blank lines
Filename must match CSV metadata
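
The text-file rules above can be sketched as follows. This is a minimal example under stated assumptions: it reads "double blank lines" literally as two or more empty lines between paragraphs (adjust the pattern if the app splits differently), and the helpers `load_paragraphs` and `missing_files` are hypothetical names, not functions from discourser.

```python
import re
import tempfile
from pathlib import Path

# Assumption: two or more blank lines (three or more newlines) separate paragraphs.
PARAGRAPH_BREAK = re.compile(r"(?:\r?\n[ \t]*){3,}")


def load_paragraphs(path):
    """Read a UTF-8 .txt file and split it into paragraphs (hypothetical helper)."""
    # read_text raises UnicodeDecodeError if the file is not valid UTF-8.
    text = Path(path).read_text(encoding="utf-8")
    return [p.strip() for p in PARAGRAPH_BREAK.split(text) if p.strip()]


def missing_files(metadata_filenames, corpus_dir):
    """Return metadata filenames that have no matching .txt file in corpus_dir."""
    present = {p.name for p in Path(corpus_dir).glob("*.txt")}
    return [name for name in metadata_filenames if name not in present]


# Demo with a temporary directory and made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "essay_01.txt").write_text(
        "First paragraph.\n\n\nSecond paragraph.\n", encoding="utf-8"
    )
    paragraphs = load_paragraphs(Path(tmp) / "essay_01.txt")
    absent = missing_files(["essay_01.txt", "essay_02.txt"], tmp)

print(paragraphs)
print(absent)
```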

graph TD
    A[Project Setup] --> B[Core Corpus Upload]
    A --> C[Target Corpus Upload]
    
    B --> D[Core Corpus Processing]
    C --> E[Target Corpus Processing]
    
    D --> F[Core Embeddings Generation]
    E --> G[Target Embeddings Generation]
    
    F --> H[Term Extraction from Core]
    H --> I[Suggested Terms for Vectors]
    
    F --> J[Core-Target Similarity Matrix]
    G --> J
    
    J --> K[Most Influential Core Texts]
    
    I --> L[Custom Vector Creation]
    F --> L
    
    L --> M[Vector Projection Analysis]
    F --> M
    G --> M
    
    M --> N[2D/3D Vector Spaces]
    
    F --> O[Topic Modeling on Core]
    O --> P[Topic-Target Analysis]
    G --> P
    
    J --> Q[Results & Visualization]
    K --> Q
    N --> Q
    P --> Q
    
    subgraph "Core Corpus Flow"
        B
        D
        F
        H
        I
        O
    end
    
    subgraph "Target Corpus Flow"
        C
        E
        G
    end
    
    subgraph "Comparative Analysis"
        J
        K
        M
        N
        P
    end
    
    subgraph "Vector Analysis"
        L
        M
        N
    end
    
    style B fill:#e1f5fe
    style C fill:#f3e5f5
    style F fill:#e8f5e8
    style G fill:#fff3e0
    style J fill:#ffebee
    style L fill:#f1f8e9
    style O fill:#e0f2f1
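
To make the "Core-Target Similarity Matrix" and "Most Influential Core Texts" steps in the diagram more concrete, here is a small standard-library sketch. It assumes cosine similarity between embeddings and ranks core texts by their mean similarity to the target corpus; the README does not say which measure or ranking discourser actually uses, so treat both as illustrative assumptions. All function names and the toy embeddings are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def similarity_matrix(core, target):
    """Rows index core texts, columns index target texts."""
    return [[cosine_similarity(c, t) for t in target] for c in core]


def rank_core_texts(matrix):
    """Rank core-text indices by mean similarity to the target corpus, highest first.

    Assumption: mean similarity is only one possible notion of "most influential".
    """
    means = [sum(row) / len(row) for row in matrix]
    return sorted(range(len(means)), key=lambda i: means[i], reverse=True)


# Toy 2-dimensional embeddings, made up for illustration.
core_embeddings = [[1.0, 0.0], [0.0, 1.0]]
target_embeddings = [[1.0, 0.0], [1.0, 1.0]]

matrix = similarity_matrix(core_embeddings, target_embeddings)
ranking = rank_core_texts(matrix)
print(ranking)
```

In practice the embeddings would come from an embedding model and have many more dimensions, but the matrix shape (core texts by target texts) is the same.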
