We have submitted a PR to the Open WebUI repo, which would enable Mojeek: https://github.com/open-webui/open-webui/discussions/6588
CEO @ Mojeek. No-tracking independent search engine.
- 0 Posts
- 11 Comments
Excellent reporting on the trials: https://www.bigtechontrial.com/
ColinHayhurst@lemmy.world to Technology@lemmy.world • Google is no longer asking — feed the AI or you’re not in search results • English · 15 · 11 months ago
Some discussion on that here: https://lemmy.world/comment/11859761
ColinHayhurst@lemmy.world to Technology@lemmy.world • Google is no longer asking — feed the AI or you’re not in search results • English · 2 · 11 months ago
Where is your evidence for that? It used to be Bing and Yandex, but now it’s just Bing. They use other non-search-engine APIs and do a small amount of crawling, AFAIK. Details of who uses what here: https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/
ColinHayhurst@lemmy.world to Technology@lemmy.world • Google is no longer asking — feed the AI or you’re not in search results • English · 30 · 11 months ago
You should put these entries into your robots.txt file.
To block the Google search crawler from all of your site, use:
User-agent: Googlebot
Disallow: /
To block the Google AI crawler, use:
User-agent: Google-Extended
Disallow: /
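If you want to sanity-check rules like these before deploying them, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It assumes the `Google-Extended` token that Google documents for its AI opt-out; the `is_allowed` helper, the `example.com` URL and the `MojeekBot` comparison agent are only illustrative. Python's matching is its own implementation and may not behave identically to Google's parser.

```python
from urllib.robotparser import RobotFileParser

# The two rule groups from the comment above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration; it parses the text locally
    instead of downloading a live robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return bool(parser.can_fetch(user_agent, url))


if __name__ == "__main__":
    page = "https://example.com/some/page"  # placeholder URL
    for agent in ("Googlebot", "Google-Extended", "MojeekBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

Under these rules the two Google tokens are refused, and any agent not named in the file is still allowed, since there is no `User-agent: *` group.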
ColinHayhurst@lemmy.world to Technology@lemmy.world • Any “small-web” search engines? • English · 6 · 11 months ago
Yes, it was. Matt Wells closed it down just over one year ago.
ColinHayhurst@lemmy.world to Technology@lemmy.world • Any “small-web” search engines? • English · 5 · 11 months ago
yep, in footer “© 2024 Infospace Holdings LLC, A System1 Company”
ColinHayhurst@lemmy.world to Technology@lemmy.world • Any “small-web” search engines? • English · 4 · 11 months ago
https://system1.com/ adtech company syndicating Bing and/or Google
ColinHayhurst@lemmy.world to Technology@lemmy.world • Any “small-web” search engines? • English · 8 · 11 months ago
We’d love to build a distributed search engine, but it would be too slow, I think. When you send us a query we go and search 8 billion+ pages, and bring back the top 10, 20… up to 1,000 results. For a good service we need to do that in 200ms, and thus one needs to centralise the index. It took years, several iterations and our carefully designed algos & architecture to make something so fast. No doubt Google, Bing, Yandex & Baidu went through similar hoops. Maybe I’m wrong and/or someone can make it work with our API.
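To illustrate the latency point, here is a rough sketch of a scatter-gather query: each index shard returns its own top results, the coordinator merges them, and the user waits for the slowest shard. This is not Mojeek's architecture. The function names and all latency figures are hypothetical; only the 200ms budget comes from the comment above.

```python
import heapq

LATENCY_BUDGET_MS = 200  # target from the comment above


def merge_top_k(shard_results, k):
    """Merge per-shard lists of (score, url) hits into a global top-k by score."""
    all_hits = (hit for hits in shard_results for hit in hits)
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])


def fanout_latency_ms(shard_latencies_ms, merge_ms=0.0):
    """A fan-out query finishes only when the slowest shard has answered."""
    return max(shard_latencies_ms) + merge_ms


if __name__ == "__main__":
    # Hypothetical numbers: shards in one datacentre vs. nodes spread over the internet.
    centralised = fanout_latency_ms([12, 15, 18], merge_ms=5)
    distributed = fanout_latency_ms([80, 140, 450], merge_ms=5)
    print(centralised <= LATENCY_BUDGET_MS, distributed <= LATENCY_BUDGET_MS)

    hits = [[(0.9, "a.example"), (0.2, "b.example")], [(0.7, "c.example")]]
    print(merge_top_k(hits, 2))
```

The more independent nodes a query has to touch, the more likely one of them is slow, and that one straggler sets the response time.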
Are the “thousands of companies” hiding their identity?