Fast Search and Transfer Engine!

Indexing and Data Growth: How FAST Keeps Its Catalog Fresh and Expanding

When Martin Schaedel, calling from Sweden, got on the phone with Stephen Baker, Director of the Internet Business Unit at Fast Search & Transfer, he heard more than the usual buzz about crawl cycles and data pipelines. FAST’s approach to indexing isn’t one‑size‑fits‑all; it blends scheduled refreshes with targeted updates that keep the engine current without draining resources. According to Stephen, FAST divides its index into two tiers of freshness. Roughly thirty percent of the database is refreshed every seven days, while the remaining seventy percent cycles roughly every four weeks. This staggered approach reduces server load while still guaranteeing high‑profile sites a weekly refresh.
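The two‑tier schedule can be sketched in a few lines of Python. This is only an illustration built from the figures quoted above: the tier names, the `catalog` layout, and every function name are assumptions, not anything FAST has described.

```python
from datetime import date, timedelta

# Intervals and shares come from the interview; the tier names are hypothetical.
REFRESH_INTERVAL_DAYS = {"weekly": 7, "monthly": 28}
TIER_SHARE = {"weekly": 0.30, "monthly": 0.70}


def next_refresh(last_crawled, tier):
    """Return the date on which a URL in the given tier is next due."""
    return last_crawled + timedelta(days=REFRESH_INTERVAL_DAYS[tier])


def due_for_refresh(catalog, today):
    """Return the URLs whose refresh date has arrived.

    catalog is a list of (url, tier, last_crawled) tuples.
    """
    return [url for url, tier, last in catalog if next_refresh(last, tier) <= today]


def daily_refresh_fraction():
    """Average share of the whole index that must be re-crawled per day."""
    return sum(TIER_SHARE[t] / REFRESH_INTERVAL_DAYS[t] for t in TIER_SHARE)
```

Under these assumptions the crawler re‑fetches about 6.8 percent of the index per day, compared with roughly 14.3 percent if every URL were refreshed weekly, which is where the reduced server load comes from.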

Martin asked whether this schedule matches the numbers the company publishes on its website, which claim a nine‑to‑eleven day update window. Stephen clarified that those figures refer to the portion of the index that gets a hard refresh - about a third of the total. The rest of the data, often sourced from slower‑moving directories or niche portals, follows a longer cycle. Still, the company is actively working to tighten those intervals. By investing in smarter crawling algorithms that can flag high‑value content for priority re‑extraction, FAST hopes to bring the overall update cycle down closer to the claimed timeframe.

The real excitement comes when we look at FAST’s size goals. Today the index hosts over nine hundred million URLs, and more than six hundred million of those have been validated through automated checks and human oversight. The next milestone is a two‑billion‑URL index, a target that feels ambitious but reachable with the current growth rate. Stephen pointed out that the majority of new entries will come from fresh pages that pass validation and deliver useful data for the engine’s ranking models. As the number of sites grows, so does the need for higher validation throughput. FAST is addressing that by expanding its crawler fleet and adopting machine‑learning filters to pre‑screen pages for relevance.

Another layer of growth comes from content that isn’t published as ordinary web pages. Fast Search & Transfer is already indexing PDF documents, and it’s moving toward integrating these file types into the same search experience that users expect for web pages. That means a PDF on a university’s research page can surface alongside a blog post when someone searches for a scholarly topic. This cross‑format integration is a key part of FAST’s strategy to stay competitive against other engines that treat documents as separate buckets.

Finally, Martin got a sneak peek into FAST’s plans for indexing dynamic content. Dynamic pages - those that change based on user input or session data - have traditionally been tricky for crawlers. Stephen explained that FAST uses a combination of parameter detection and headless browsers to render pages and capture the final HTML. The resulting snapshots become part of the index, ensuring that even highly interactive sites contribute to search results. This ability to index dynamic content not only broadens the coverage of the index but also makes it easier for marketers to measure how their dynamic assets perform in the wild.
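Stephen did not spell out how parameter detection works, so the following is only a sketch of one common approach: treat any URL with a query string as dynamic, and strip session‑style parameters so the same page is not indexed once per visitor. The parameter names in `SESSION_PARAMS` and both function names are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameter names that usually carry session state.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def is_dynamic(url):
    """Treat any URL that carries a query string as a dynamic page."""
    return bool(urlsplit(url).query)


def canonicalize(url):
    """Drop session parameters and sort the rest so equivalent URLs compare equal."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    kept.sort()
    # The fragment is discarded because it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

With this sketch, `http://example.com/item?id=7&sid=abc123` and `http://example.com/item?id=7` reduce to the same canonical URL, so only one snapshot needs to be rendered and stored.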

With these upgrades, FAST positions itself as an engine that doesn’t just pull the latest web content - it curates a massive, validated collection that grows continuously. The company’s roadmap indicates that the index will soon exceed two billion URLs, opening new avenues for advanced search features and richer user experiences. For webmasters, that means a larger audience ready to discover their content, as long as they keep their sites well‑structured and up‑to‑date. The conversation with Stephen left Martin - and us - confident that FAST is ready to keep pace with the ever‑expanding internet.

Pay‑for‑Inclusion and New Webmaster Tools: Unlocking Visibility Through Partnership

Fast Search & Transfer’s latest initiative, InSite Select, launched in partnership with Lycos, marks a shift toward a more business‑centric model for visibility. Stephen explained that paid inclusion is not simply about getting a site listed; it’s a curated process that ensures only high‑quality, relevant pages appear in search results. By reviewing submissions before they hit the index, FAST can sidestep many of the pitfalls that plague free crawlers, such as duplicate content or cloaked pages that serve deceptive signals to users.

One of the most striking differences between free and paid inclusion is how the engine handles dynamic content. When a webmaster pays for inclusion, the crawler gains priority access and can execute JavaScript, apply CSS, and capture the output of server‑side scripts before it logs the page. This means that sites that rely on AJAX calls or client‑side rendering can have their full content captured accurately. In contrast, the free crawler must make do with static snapshots, which can miss critical elements like product images or interactive maps.

Stephen highlighted several new tools on the horizon for those who opt into the paid program. The click‑through metric will provide insights into how often users click on a particular result versus how many times the result appears. That data can help marketers tweak their meta titles and descriptions for better performance. Keyword reporting will give visibility into which search terms actually drive traffic, allowing for more targeted SEO strategies. Together, these tools aim to make the paid experience not just faster, but smarter.
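The click‑through metric described here is simple arithmetic: clicks divided by the number of times the result appeared. A minimal sketch follows, assuming a hypothetical `stats` mapping of URL to (clicks, impressions); the report format FAST will actually ship was not described.

```python
def click_through_rate(clicks, impressions):
    """Clicks divided by impressions; zero when the result was never shown."""
    if impressions <= 0:
        return 0.0
    return clicks / impressions


def ctr_report(stats):
    """Return (url, ctr) pairs sorted from lowest to highest click-through rate.

    stats maps each URL to a (clicks, impressions) tuple.
    """
    rows = [(url, click_through_rate(c, i)) for url, (c, i) in stats.items()]
    return sorted(rows, key=lambda row: row[1])
```

Sorting lowest first puts the pages whose titles and descriptions most need attention at the top of the list.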

The economics of the program are also worth noting. While the company didn’t disclose exact pricing, Stephen emphasized that the goal is to make the service accessible to mid‑sized businesses while still offering premium features for larger enterprises. By providing a tiered pricing model, FAST hopes to democratize the benefits of accurate indexing without sacrificing revenue. The partnership with Lycos also expands the distribution network, giving paid sites a broader reach across multiple platforms.

From a webmaster’s perspective, the biggest advantage of paid inclusion is the guarantee of index freshness. Fast Search & Transfer’s crawl schedules, which prioritize paid sites, mean that newly updated pages will surface in search results within hours. For e‑commerce sites with rapidly changing inventory or for news outlets that push fresh stories daily, this rapid turnaround can be a game‑changer. Additionally, the quality review process reduces the risk of being penalized for cloaked or duplicate content - a concern that often plagues the free space.

In short, InSite Select positions Fast Search & Transfer as a service that rewards quality and offers tangible metrics to refine performance. For webmasters who need a guaranteed place in the fast‑moving search landscape, the partnership with Lycos and the new tooling suite provide a compelling reason to consider paid inclusion over the traditional free crawl.

Multimedia and File Type Indexing: Expanding Beyond Text to PDFs, GIFs, and Flash

Fast Search & Transfer does not limit itself to HTML text; it actively seeks to make diverse file types part of its core search experience. Stephen discussed the company’s current capabilities and future ambitions in indexing formats that often get sidelined by mainstream search engines. The first format on FAST’s radar is PDF, a staple of academic research and corporate reports. The engine already extracts text from PDFs and adds those results to the index, but the next step is to merge PDF relevance with web‑page relevance so that a PDF can surface alongside a blog post when someone searches for a scholarly term.

Beyond PDFs, FAST already has a system in place to index GIF files. This isn’t just about recognizing the file extension; it involves parsing the GIF’s metadata and, in some cases, pulling text from the alt attributes of the pages that embed the image. The goal is to surface GIFs that carry meaningful captions or explanatory text. While GIFs are primarily visual, they often serve a communicative purpose in forums or social media, and FAST’s indexing allows those visuals to be discoverable via keyword queries.
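Alt text lives in the HTML that embeds an image rather than in the GIF file itself, so one plausible reading of this step is a pass over the embedding page. The sketch below uses Python’s standard `html.parser`; the class and function names are assumptions, and nothing here reflects FAST’s actual extractor.

```python
from html.parser import HTMLParser


class GifAltCollector(HTMLParser):
    """Collect non-empty alt text for every <img> whose src ends in .gif."""

    def __init__(self):
        super().__init__()
        self.captions = {}

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or ""
        alt = (attr.get("alt") or "").strip()
        # Skip spacer images and anything without descriptive text.
        if src.lower().endswith(".gif") and alt:
            self.captions[src] = alt


def gif_captions(html):
    """Return a mapping of GIF source path to its alt text."""
    collector = GifAltCollector()
    collector.feed(html)
    return collector.captions
```

The resulting captions are what a keyword query could match against, which is why images with empty alt text stay invisible in this scheme.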

The company has also explored multimedia files beyond static images. Audio and video files present a larger challenge due to size and the need for transcription. While FAST has not yet launched a full multimedia search feature, the conversation with Stephen revealed plans to collaborate with third‑party speech‑to‑text services to generate searchable transcripts for popular video formats. This would enable users to search for specific phrases within a video, thereby broadening the engine’s utility for educators and content creators.

Flash files, once the gold standard for web animation and interactivity, remain a niche area for indexing. Stephen mentioned ongoing discussions with Macromedia about converting Flash to HTML5 so that its content can be rendered by a headless browser. Although the timeline for Flash indexing is uncertain, the willingness to explore such a conversion shows FAST’s commitment to staying ahead of evolving web standards. As browsers move toward native HTML5 support, early adoption of Flash conversion could position FAST as a forward‑looking engine.

The technical backbone of these multimedia efforts relies on robust extraction pipelines. FAST employs a combination of OCR for image‑based text, embedded metadata parsing, and headless browser rendering for dynamic pages. The extracted data feeds into the same ranking algorithm that governs text pages, ensuring a consistent relevance signal across file types. This parity means that a user searching for “company annual report 2024” can find the PDF version and any supporting charts or images, all ranked together.
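The idea of routing each file type to its own extractor and then handing one uniform record to the ranker can be sketched as follows. The extractor labels, the `IndexRecord` fields, and the function names are all hypothetical; the real extractors (OCR, PDF text, rendering) are deliberately left out.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from urllib.parse import urlsplit


@dataclass
class IndexRecord:
    """One uniform record, whatever the source format was."""
    url: str
    file_type: str
    text: str


# Hypothetical mapping from file extension to the extractor that handles it.
EXTRACTORS = {
    ".pdf": "pdf_text",
    ".gif": "image_metadata",
    ".html": "html_text",
    ".htm": "html_text",
}


def route(url):
    """Pick an extractor label from the URL path, defaulting to HTML."""
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    return EXTRACTORS.get(suffix, "html_text")


def to_record(url, extracted_text):
    """Normalize whitespace so every format reaches the ranker in the same shape."""
    return IndexRecord(url=url, file_type=route(url), text=" ".join(extracted_text.split()))
```

Because every format ends up as the same record type, a single ranking function can score PDFs, images, and pages side by side, which is the parity the paragraph above describes.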

From an SEO standpoint, these capabilities open new avenues for visibility. Content creators can now include PDF brochures, GIF infographics, or even archival Flash presentations, confident that FAST’s index will surface them alongside web pages. For webmasters who publish technical documentation, product catalogs, or multimedia tutorials, the expanded file‑type coverage can translate into higher click‑through rates and longer session durations.

In conclusion, FAST’s focus on multimedia and file‑type indexing demonstrates a broader vision of search: to capture every piece of information that the web offers, regardless of format. By continually adding new file types to its index and ensuring they compete fairly with standard web pages, Fast Search & Transfer strengthens its position as a comprehensive search solution for both businesses and end users.

Global Presence and Spam Policy: Building a Worldwide Network While Guarding Search Quality

Fast Search & Transfer’s growth strategy has always been anchored in a global perspective. Stephen outlined how the company began in Europe, where major search players were scarce, and how it now maintains partners on every continent. This geographic diversity translates into a richer index because local content gets better visibility, and it also allows FAST to adapt its crawling priorities to region‑specific search behaviors.

One of the newest partnerships is with Telus, a leading Canadian telecommunications provider. The alliance enables FAST to embed its search results directly into Telus’s network of websites, increasing the engine’s reach among Canadian users. By integrating search into a widely used ISP platform, FAST can gather user interaction data that informs its ranking algorithms - an advantage that helps keep the results relevant for each region.

The company’s partner ecosystem extends beyond ISP integrations. FAST works with regional search aggregators, portal operators, and content syndicators across North America, South America, Asia, and Africa. These collaborations help Fast Search & Transfer discover new content quickly and ensure that local regulations around data privacy and web crawling are respected. The result is a more trustworthy index that balances depth with compliance.

While expanding worldwide, FAST remains vigilant against spam and cloaking. Stephen explained that the engine distinguishes between paid and free inclusion when assessing cloaked pages. For free submissions, the crawler cannot determine the page’s intent, so any suspicious cloaking triggers removal or a penalty. For paid sites, the review process includes manual checks that allow the team to gauge the page’s purpose before it lands in the index. This tiered approach means that legitimate dynamic sites that use cloaking for personalization are not penalized, while deceptive practices remain blocked.
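The tiered policy can be expressed as a small decision function. This is a sketch under stated assumptions: cloaking is approximated by comparing what the server returns to a crawler with what it returns to an ordinary browser, and the action labels (`index`, `manual_review`, `remove`) are invented for illustration.

```python
import hashlib


def fingerprint(body):
    """Hash the page text after collapsing whitespace and case."""
    normalized = " ".join(body.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def looks_cloaked(crawler_body, browser_body):
    """Flag a page when the crawler and a browser receive different content."""
    return fingerprint(crawler_body) != fingerprint(browser_body)


def cloaking_action(crawler_body, browser_body, paid):
    """Apply the tiered policy: paid pages get a human look, free pages do not."""
    if not looks_cloaked(crawler_body, browser_body):
        return "index"
    return "manual_review" if paid else "remove"
```

An exact‑match comparison like this would also flag harmless personalization, which is precisely why the paid tier routes mismatches to a reviewer instead of removing them automatically.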

The spam‑handling workflow also extends to user reports. FAST has set up a dedicated email address, spam@fastsearch.com, where webmasters or everyday users can flag suspicious results. Stephen said that the review team actively monitors this channel and feeds user reports into the search quality team’s training data. By involving the community, FAST demonstrates a commitment to maintaining high search integrity across all regions.

From an SEO perspective, these global strategies have practical implications. If a site targets a multilingual audience, partnering with FAST’s regional networks can increase visibility in non‑English markets. The spam policy, meanwhile, signals to webmasters that adherence to search best practices - such as avoiding cloaking or duplicate content - will pay off in terms of index inclusion and ranking stability.

Overall, Fast Search & Transfer’s worldwide network, combined with a nuanced spam policy, positions it as a search engine that values both reach and quality. By offering transparent processes for paid inclusion and maintaining rigorous standards against spam, FAST provides a reliable platform for businesses looking to reach audiences across the globe.
