CrawlIQ Technical SEO Evaluation Methodology
A step-by-step breakdown of the five-stage AI technical SEO evaluation process — from URL submission to structured Excel reference document.
The Five-Stage Evaluation Methodology
Step 1: Submit the Evaluation Target URL
Navigate to the CrawlIQ reference system and provide any website URL as the evaluation target. You can configure:
- Page limit — 1 to 50 pages without login; up to 200 with a free account
- Evaluation depth — how many link levels deep to follow from the root URL
- AI provider — Groq (default, fastest), Gemini, Claude, OpenAI, or rules-based fallback
No credit card or signup is required for the first 50 pages of evaluation.
Step 2: Async BFS Evaluation Engine Examines Pages
CrawlIQ's evaluation engine uses Python's aiohttp to fetch pages concurrently, traversing outward from your root URL in breadth-first search (BFS) order. For each page it extracts:
- Title tag, meta description, canonical URL
- H1, H2, H3 heading structure and all heading text
- Open Graph and Twitter Card meta tags
- All internal links (to discover the next BFS layer)
- Image alt attributes and image count
- Body text word count and keyword frequency
- HTTP status code, redirect chain if any, and final URL
SSL failures fall back to HTTP automatically so the evaluation never stops on certificate errors. Real-time progress is streamed to your browser via Server-Sent Events (SSE).
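The BFS layering described above can be sketched as follows. This is a minimal illustration with the network fetch stubbed out by a fake link graph; the real engine would issue the requests with aiohttp, but the frontier logic — gather one layer concurrently, collect its links, then move to the next layer — is the same. All names here (`fetch_links`, `bfs_crawl`, `FAKE_SITE`) are illustrative, not CrawlIQ's actual API.

```python
import asyncio
from urllib.parse import urljoin

# Stubbed link graph standing in for real HTTP fetches; the production
# engine would download each page with aiohttp instead.
FAKE_SITE = {
    "https://example.com/": ["/about", "/blog"],
    "https://example.com/about": ["/team"],
    "https://example.com/blog": ["/blog/post-1"],
    "https://example.com/team": [],
    "https://example.com/blog/post-1": [],
}

async def fetch_links(url: str) -> list[str]:
    """Pretend to fetch a page and return its internal links."""
    await asyncio.sleep(0)  # yield control, as a real request would
    return [urljoin(url, href) for href in FAKE_SITE.get(url, [])]

async def bfs_crawl(root: str, page_limit: int = 50) -> list[str]:
    """Breadth-first crawl: every page in a layer is fetched concurrently."""
    seen, order, frontier = {root}, [], [root]
    while frontier and len(order) < page_limit:
        frontier = frontier[: page_limit - len(order)]
        order.extend(frontier)
        # One gather per BFS layer = the whole layer in flight at once.
        results = await asyncio.gather(*(fetch_links(u) for u in frontier))
        next_layer = []
        for links in results:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    next_layer.append(link)
        frontier = next_layer
    return order

order = asyncio.run(bfs_crawl("https://example.com/", page_limit=4))
# Root and its directly linked pages are visited before deeper pages.
```

Note how the page limit truncates a layer rather than the queue tail, so the shallowest (most-linked) pages always make the cut first.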
Step 3: Signal Identification and Page Scoring
Every evaluated page is examined through 50+ checks across 15 issue categories. Each page receives a score from 0–100 and an A–F grade. Signals identified include:
- Title tag missing, too short (<30 chars), too long (>60 chars), or duplicated
- Meta description missing, too short, too long, or duplicated across pages
- H1 missing, multiple H1 tags, or H1 identical to title
- Canonical absent, incorrect, or forming a redirect loop
- Pages accidentally marked noindex or blocked by X-Robots-Tag — unintentional blocking signals
- Broken links (4xx) and server errors (5xx) within internal links
- Redirect chains consuming indexation budget (evaluation coverage impact)
- Images missing alt text
- Thin content pages with under 300 words
- Non-HTTPS pages or mixed content warnings
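Checks of this kind are simple threshold rules over the extracted signals. The sketch below shows a handful of them using the ranges stated above (30–60 character titles, a 300-word content floor); the field names and issue labels are illustrative, not CrawlIQ's internal schema.

```python
def check_page(page: dict) -> list[str]:
    """Run a few rule-based SEO checks over one page's extracted signals.

    Thresholds follow the ranges described in the methodology:
    titles 30-60 chars, a single H1, at least 300 words, HTTPS only.
    """
    issues = []
    title = page.get("title", "")
    if not title:
        issues.append("title_missing")
    elif len(title) < 30:
        issues.append("title_too_short")
    elif len(title) > 60:
        issues.append("title_too_long")
    h1_tags = page.get("h1_tags", [])
    if len(h1_tags) == 0:
        issues.append("h1_missing")
    elif len(h1_tags) > 1:
        issues.append("multiple_h1")
    if page.get("word_count", 0) < 300:
        issues.append("thin_content")
    if page.get("url", "").startswith("http://"):
        issues.append("non_https")
    return issues

issues = check_page({
    "url": "http://example.com/page",
    "title": "Hi",
    "h1_tags": ["One", "Two"],
    "word_count": 120,
})
# -> ["title_too_short", "multiple_h1", "thin_content", "non_https"]
```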
Step 4: AI Model Produces Remediation Documentation
After the evaluation completes, select individual pages or process all results with AI at once. CrawlIQ builds a structured prompt for each page containing its current SEO signals and identified issues, then sends it to your chosen AI model.
Each AI remediation response includes:
- Optimised title tag (within 50–60 character target)
- Rewritten meta description (within 140–155 character target)
- Improved H1 tag aligned with primary keyword intent
- Content recommendations: gaps to fill, sections to add, topics to cover
- Priority ranking of all detected issues by SEO impact
All remediation guidance is stored in the Optimisation Reference Table for bulk review and export.
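The per-page prompt assembly can be pictured like this. The sketch packs the extracted signals and detected issues into one structured text block ready to send to the chosen model; the exact wording, field names, and instruction format are assumptions for illustration, not CrawlIQ's real prompt template.

```python
def build_remediation_prompt(page: dict) -> str:
    """Assemble a structured remediation prompt from one page's signals.

    The field names here are illustrative; the key idea is that current
    values and detected issues travel together so the model has context.
    """
    lines = [
        "You are a technical SEO assistant. Improve the on-page signals below.",
        f"URL: {page['url']}",
        f"Current title ({len(page['title'])} chars): {page['title'] or '(missing)'}",
        f"Current meta description: {page['meta_description'] or '(missing)'}",
        f"Current H1: {page['h1'] or '(missing)'}",
        "Detected issues: " + ", ".join(page["issues"]),
        "Return: an optimised title (50-60 chars), a meta description "
        "(140-155 chars), an improved H1, and content recommendations.",
    ]
    return "\n".join(lines)

prompt = build_remediation_prompt({
    "url": "https://example.com/pricing",
    "title": "Pricing",
    "meta_description": "",
    "h1": "",
    "issues": ["title_too_short", "meta_description_missing", "h1_missing"],
})
```

Because every prompt carries the same structure, the responses can be parsed uniformly into the Optimisation Reference Table regardless of which AI provider produced them.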
Step 5: Export Structured Excel Reference Document
Export a multi-sheet Excel workbook (.xlsx) containing:
- Full SEO Report — every page with score, grade, all identified issues, and keyword list
- Technical SEO Assessment — per-page 0–100 scores with A–F grades and indexability status
- Optimisation Reference Table — optimised titles, descriptions, and H1s ready to paste into your CMS
- Per-Field Issues — each issue type as a separate column for bulk CMS upload workflows
- AI Remediation Guidance — complete AI output for each page including all recommendations
Technical Methodology Questions
Does CrawlIQ evaluate JavaScript-rendered pages?
CrawlIQ uses an HTML-only evaluation engine (aiohttp + BeautifulSoup). It reads the initial HTML response returned by the server — the same HTML that Googlebot reads on a first-pass examination. JavaScript-rendered SPAs that return empty HTML shells will show limited content; server-side rendered or static sites receive full evaluation coverage.
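The practical consequence is easy to demonstrate. The sketch below extracts visible body text from raw HTML with the standard-library parser (a stand-in for the BeautifulSoup step, to keep the example dependency-free): a server-rendered page yields its full text, while an SPA shell that defers everything to JavaScript yields nothing, because no script is ever executed.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible body text from raw HTML -- no JavaScript execution,
    mirroring what an HTML-only engine sees in the initial response."""
    def __init__(self):
        super().__init__()
        self.words = []
        self._skip = 0  # depth inside <script>/<style>, whose text is not visible

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.extend(data.split())

def word_count(html: str) -> int:
    parser = TextExtractor()
    parser.feed(html)
    return len(parser.words)

ssr = "<html><body><h1>Guide</h1><p>Full server-rendered article text.</p></body></html>"
spa = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'
# word_count(ssr) sees every word; word_count(spa) sees an empty shell.
```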
How long does an evaluation take?
CrawlIQ processes fully asynchronous concurrent requests. A 50-page evaluation typically completes in 15–45 seconds depending on your server response times. The first evaluation on HuggingFace's free tier may take 30–90 seconds for the container to wake up — a cold-start notice is shown automatically.
Is evaluation data stored anywhere?
No. All evaluation data is processed in an isolated in-memory session on the HuggingFace container. Data is not written to any database or disk. When a new evaluation begins, the previous session is cleared. The system is fully open-source — you can verify this in the GitHub repository.
Which AI model gives the best evaluation results?
Groq (Llama 3 70B) is the fastest and works well for bulk evaluation. Claude (Anthropic) gives the most nuanced content quality assessments. Gemini performs well for multilingual content. For most users, Groq's free API is the best starting point — no credit card required at console.groq.com.
SEO Signal Score Methodology: 0–100 Scale Explained
Every page the evaluation engine examines receives a composite score from 0 to 100 and a letter grade from A to F. The score is a weighted average across five signal categories:
- Title tag quality (25 points) — presence, length, uniqueness, keyword placement
- Meta description quality (20 points) — presence, length range, uniqueness, CTA language
- Heading structure (20 points) — single H1, H1 contains keyword, logical H2/H3 hierarchy
- Indexability and canonicalisation (20 points) — canonical tag correct, no accidental noindex, no redirect chains
- Content quality (15 points) — word count above 300, images have alt text, no thin-content flags
Pages scoring 90–100 receive an A grade. 75–89 = B; 55–74 = C; 35–54 = D; below 35 = F. A missing title tag is penalised more heavily than a slightly-too-long meta description. The Optimisation Reference Table sorts all evaluated pages by score ascending so the lowest-scoring pages appear at the top, letting you prioritise remediation efficiently.
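The weighted average and grade bands above can be expressed directly in code. This is a minimal sketch: the category names and the idea of a 0.0–1.0 fraction earned per category are illustrative, but the weights (25/20/20/20/15) and grade cut-offs are exactly those stated in this section.

```python
# Weights per signal category, summing to 100 as described above.
WEIGHTS = {"title": 25, "meta": 20, "headings": 20, "indexability": 20, "content": 15}

def composite_score(subscores: dict) -> int:
    """Weighted 0-100 score; subscores are fractions earned (0.0-1.0)."""
    return round(sum(WEIGHTS[c] * subscores.get(c, 0.0) for c in WEIGHTS))

def grade(score: int) -> str:
    """Map a 0-100 score to the A-F bands given above."""
    if score >= 90: return "A"
    if score >= 75: return "B"
    if score >= 55: return "C"
    if score >= 35: return "D"
    return "F"

# A page with a perfect title, headings, and indexability, but a missing
# meta description and weak content, lands in the C band:
score = composite_score({"title": 1.0, "meta": 0.0, "headings": 1.0,
                         "indexability": 1.0, "content": 0.4})
# score == 71, grade(score) == "C"
```

The weighting explains the remark above: losing the full 25-point title category drops a page harder than shaving a few points off the 20-point meta description category.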
What CrawlIQ Automates vs What Still Requires Human Judgment
CrawlIQ automates the measurable, rule-based layer of technical SEO evaluation. Understanding this distinction helps you get maximum value from the reference system.
Automated by CrawlIQ:
- Detecting missing, duplicate, or malformed title tags and meta descriptions across every page
- Finding broken internal links, redirect chains, and pages blocked to Googlebot
- Flagging missing canonical tags or canonicals pointing to non-200 URLs
- Identifying thin content pages, orphan pages, and images without alt text
- Producing AI-improved title, description, and H1 suggestions per page
- Exporting everything to a structured Excel workbook for your team
Still requires human judgment:
- Evaluating whether the content genuinely satisfies search intent for a given keyword
- Assessing E-E-A-T signals (Experience, Expertise, Authoritativeness, Trustworthiness)
- Deciding whether to consolidate, redirect, or delete underperforming pages
- Reviewing Core Web Vitals and page experience signals (use PageSpeed Insights for this)
- Building backlinks — no evaluation system can automate authority acquisition
CrawlIQ is most effective as the first pass of a comprehensive evaluation. It surfaces every measurable on-page and technical issue in minutes, freeing you to spend your time on the strategic decisions a reference system cannot make for you.
Why BFS Evaluation Order Matters for SEO Coverage
CrawlIQ uses breadth-first search (BFS) rather than depth-first search (DFS) for a specific reason: BFS discovers your most-linked-to pages first. Pages in your main navigation, homepage body links, and footer are evaluated before deep subcategory or pagination pages. This mirrors how Googlebot allocates indexation budget — it prioritises pages with more internal links pointing to them.
For a 50-page evaluation limit, this means CrawlIQ always assesses your most important pages first. If your site has 500 pages and you set a limit of 50, you get an accurate picture of your highest-priority URLs rather than an arbitrary sample from a single deep category chain.
The evaluation engine also deduplicates URLs by stripping UTM parameters, trailing slashes, and common tracking query strings before adding them to the evaluation queue — so https://example.com/?utm_source=newsletter and https://example.com/ count as one URL, not two.
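The normalisation step above can be sketched with the standard library. The example drops utm_* parameters and trailing slashes as described; the extra tracking keys (gclid, fbclid, ref) are an illustrative guess at "common tracking query strings", not CrawlIQ's exact list.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"gclid", "fbclid", "ref"}  # illustrative set

def normalise(url: str) -> str:
    """Canonicalise a URL before queueing it, so tracking-parameter and
    trailing-slash variants collapse to a single crawl entry."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if not k.startswith(TRACKING_PREFIXES) and k not in TRACKING_KEYS]
    path = parts.path.rstrip("/") or "/"
    # Fragments are dropped entirely: they never reach the server.
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))

a = normalise("https://example.com/?utm_source=newsletter")
b = normalise("https://example.com/")
# a == b == "https://example.com/" -- one queue entry, not two.
```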