Generate a 10,000-row, NLTK-compatible dataset for NLP tasks by simply providing a topic. Suited to data scientists and researchers who need large-scale text data quickly for model training and analysis.
You are an expert NLTK Dataset Generator. Your task is to create a massive, high-quality dataset of exactly 10,000 rows optimized for Natural Language Toolkit (NLTK) usage, based on a user-provided topic. The dataset must be in CSV format for easy import into NLTK (e.g., via pandas then NLTK processing). Follow these numbered steps precisely:

1. **Analyze the Topic**: Receive the user's topic (e.g., 'two Reddit users becoming e-friends'). Understand it deeply to generate diverse, realistic text data relevant to NLP tasks like tokenization, sentiment analysis, POS tagging, or corpora building.
2. **Define Dataset Structure**: Output a CSV with these exact columns:
   - `id`: Sequential integer from 1 to 10000.
   - `text`: A short paragraph or sentence (20-100 words) on the topic.
   - `label`: A category or sentiment (e.g., 'positive', 'negative', 'neutral', 'question', 'story') fitting the text.
   - `source_type`: Simulate origin like 'reddit_post', 'comment', 'tweet', 'article_snippet'.
3. **Ensure Diversity and Quality**:
   - Vary language: Mix formal/informal, questions/statements, emotions.
   - Realistic content: Natural, error-free English (unless topic specifies otherwise).
   - Balance labels: Roughly 30% positive, 30% negative/neutral, 20% questions, 20% stories.
   - No duplicates: Each row unique.
   - NLP-friendly: Include varied sentence structures, vocabulary, punctuation for robust training.
4. **Generate the Dataset**:
   - Produce EXACTLY 10,000 rows.
   - Start output with: '```csv\n' followed by headers, then data rows, end with '\n```'.
   - Make it downloadable/copy-paste ready.
5. **Validation**: Before final output, internally verify row count = 10,000, columns correct, content relevant.

User topic: [INSERT YOUR TOPIC HERE]

Generate the dataset now.
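The checks the prompt asks the model to run internally (row count, column names, sequential ids, no duplicates, 20-100 words per text) can also be run on the returned CSV. The following is a minimal sketch using only the Python standard library; the function names `validate_dataset` and `label_shares` are hypothetical, and the sketch assumes the CSV text has already been extracted from the fenced block in the model's reply.

```python
import csv
import io
from collections import Counter

# Column names as specified in step 2 of the prompt.
EXPECTED_COLUMNS = ["id", "text", "label", "source_type"]


def validate_dataset(csv_text, expected_rows=10_000):
    """Return a list of problems found in the CSV text (empty list = passed).

    Hypothetical helper: it mirrors the prompt's validation step, it is not
    part of NLTK or pandas.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))

    # Column check: stop early, the remaining checks rely on these names.
    if reader.fieldnames != EXPECTED_COLUMNS:
        problems.append(f"unexpected columns: {reader.fieldnames}")
        return problems

    rows = list(reader)

    # Row count check.
    if len(rows) != expected_rows:
        problems.append(f"row count is {len(rows)}, expected {expected_rows}")

    # ids must run 1, 2, 3, ... with no gaps.
    ids = [row.get("id") for row in rows]
    if ids != [str(i) for i in range(1, len(rows) + 1)]:
        problems.append("ids are not sequential starting at 1")

    # Each text must be unique.
    text_counts = Counter((row.get("text") or "") for row in rows)
    duplicated = sum(1 for count in text_counts.values() if count > 1)
    if duplicated:
        problems.append(f"{duplicated} text values appear more than once")

    # Each text should be 20-100 whitespace-separated words.
    out_of_range = sum(
        1 for row in rows
        if not 20 <= len((row.get("text") or "").split()) <= 100
    )
    if out_of_range:
        problems.append(f"{out_of_range} texts fall outside 20-100 words")

    return problems


def label_shares(csv_text):
    """Return each label's share of the rows, to compare with the target balance."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(row.get("label") for row in rows)
    total = len(rows)
    return {label: count / total for label, count in counts.items()} if total else {}
```

Calling `validate_dataset(csv_text)` on the model's output returns a list of human-readable problems, and `label_shares(csv_text)` gives the observed label proportions to compare against the rough 30/30/20/20 split the prompt requests. A word count based on whitespace splitting is an approximation; NLTK's tokenizers would count punctuation separately.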
Structured web research using ChatGPT's browsing capability. Systematic source evaluation, fact-checking, and synthesis with proper citations.
Design production-ready ChatGPT API integrations. Covers authentication, streaming, function calling, structured outputs, and cost optimization with the latest OpenAI SDK.
Step-by-step data analysis pipeline using ChatGPT's Code Interpreter. Upload CSV/Excel files for cleaning, visualization, statistical analysis, and insights.
Optimize ChatGPT's memory feature for persistent context. Teaches how to structure memories, manage what's stored, and leverage personalization effectively.
Generate precise, creative DALL-E 3 prompts. Handles style specifications, aspect ratios, composition rules, and iterative refinement for stunning AI-generated images.
Leverage ChatGPT Canvas mode for iterative document editing, code review, and collaborative writing with inline suggestions and tracked changes.