WbSrch Search Engine Releases New Data Offerings

Reprint of a press release originally published on https://www.prweb.com/releases/wbsrch_search_engine_releases_new_data_offerings/prweb17779070.htm.

The WbSrch search engine has released new data offerings.

Two new domain lists are available for purchase. The first is a list of all of the internet domains with pages in the WbSrch search engine index.

The second is a list of all adult-content domains that have been excluded from the search engine.

In addition to these, WbSrch also sells a list of the top 1 million most-linked-to domains based on its link index.

This data will be a useful tool for data scientists, SEO specialists, and for entrepreneurs who want to build domain-information-based product offerings. Founder Jason Champion had this to say about the project launch:

“When I started this search engine, there was a terrible shortage of good data sources for getting started with organizing the web. With this release we hope to make it easier to start web-data-driven projects.”

These new files are available in the WbSrch data shop at https://wbsrch.onfastspring.com/ priced at $39.99 and $4.99, respectively.

Smarthost.net: Ultimately Just a Waste of Time

Smarthost.net has some nice “storage server” deals with some very configurable options. If you want a VPS with 1TB of disk space, their offerings are pretty attractive.

For my search engine, I need hosting with a good chunk of disk space in order to hold the index. It doesn’t need to be fast storage, and it doesn’t need a lot of CPU and RAM — retrieval of index entries is fast and efficient.

This made Smarthost look pretty ideal, so I signed up and got a web server going. It worked well for about a month. So I decided to set up a second one to do some light web crawling (you don’t get enough cores on their plans to do anything heavy).

After about a week, both machines were unreachable. I contacted support and found out that the drive array had failed on the machine hosting both of them. Support tried to recover it, but ultimately it was a total failure. So the little bit of web crawling data and about 2 weeks of search engine log data (everything since the last time I pulled it) were destroyed.

Annoying, but hardware failures happen.

A week later I found my crawler machine suspended because of a false positive on Spamhaus. Apparently their system is so badly written that just visiting a domain with a web crawler can get you on a “bad list” for supposedly hosting a virus or malware. Many hosting providers, Smarthost included, will auto-suspend service for any box that gets on that list.

I got that machine removed from Spamhaus the same day and had it reactivated a few hours later to download the 200k pages or so that had been crawled, but support was pretty snarky about it. Clearly Smarthost is not a service that is compatible with what I do.

I ended up moving the web server to 1tbvps, which is slightly more expensive, but has more CPU cores and RAM, which is always nice. I moved the crawler to Digital Ocean, which is a very data-science-friendly service. We’ll see whether I have issues with those, but I suspect they will work better for my purposes.

Ultimately my two-month experience with Smarthost ended up being a complete waste of time.


New Web Browser: Scleroglossa

For quite a while I’ve wanted to build a web browser based on the Gecko engine by Mozilla, which is what powers Firefox. Until recently I never had the right combination of time and motivation to dig in.

Well, now that I have, here’s the result – the Scleroglossa browser for Windows.

It’s available for download on the Lambda Centauri website.

Cleaner URLs Without Tracking Nonsense

Have you ever seen a link with a bunch of extra stuff on it? Facebook URLs with “fbclid=<big string of letters>” or links with a bunch of “utm_medium=<whatever>” or those horrendously long product links you get from Amazon?

They’re used for tracking behavior, and handy for people getting marketing and attribution data. If you don’t mind them, that’s cool. They annoy me a little because I like clean, readable URLs.

There’s a browser extension to get rid of them, called ClearURLs:

Chrome: https://chrome.google.com/webstore/detail/clearurls/lckanjgmijmafbedllaakclkaicjfmnk/related

Firefox: https://addons.mozilla.org/en-US/firefox/addon/clearurls/
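
To give a feel for what stripping these parameters involves, here is a minimal Python sketch. It is not how ClearURLs works internally (that extension uses a much larger rule set). The function name strip_tracking and the tiny parameter list are assumptions made up for this illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative list only; real tools maintain far larger rule sets.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid"}


def strip_tracking(url: str) -> str:
    """Return url with the tracking query parameters listed above removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_NAMES and not name.startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the parameters that survived the filter.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://example.com/page?id=7&utm_medium=email&fbclid=abc123"))
```

One caveat: urlencode re-encodes the remaining parameters, so unusual escaping in the original query string may come out in a slightly different (but equivalent) form.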

I Don’t Care About Cookies

I’m tired of websites showing me cookie warnings that I have to click through to remove some sort of overlay that obscures some portion of the site. I have never cared about cookies, nor will I ever. They’re a built-in part of the browser that should just work invisibly, and they’re an important part of making apps work.

There’s an extension that’s called, appropriately, “I Don’t Care About Cookies”. Here it is:

Firefox: https://addons.mozilla.org/en-US/firefox/addon/i-dont-care-about-cookies/

Chrome: https://chrome.google.com/webstore/detail/i-dont-care-about-cookies/fihnjjcciajhdojfnbdddfaoknhalnja

Galaksion Didn’t Work For Me

I’ve removed Galaksion from the ad networks being used on my site.

It’s a shame that I had to, because I had a generally good feeling about them. I liked their website, their publisher dashboard, and how easy it was to sign up and set up my site to use them.

Their ad fill rate was not great, but I don’t expect that of a new site that hasn’t been sent much traffic yet. The ads that did fill at least had a non-zero CPM rate.

But there’s one thing that happened that I can’t live with.

While testing some updates to my site, I clicked on a text box to edit some details, and a new tab automatically opened to some ad-based destination. I did not click on an ad to get there. I might have inaccurately attributed a similar auto-navigation to Adsterra a few days ago. Adsterra would have been removed anyway because of their slow script and poor reputation, and I don’t care to do the research to figure out which of the two was the real culprit in that earlier malvertising navigation (it was probably Adsterra, I’m just not 100% confident).

Anything that will clickjack or hinder my site’s functionality is not welcome, so they’ve been removed from the rotation. Now we’re down to just RevenueHits and Bidvertiser.

I really wish this process involved me removing providers because I didn’t earn as much with them rather than dealing with technical issues that break my site. How is above-ground malvertising and clickjacking even still a thing in 2020? I should have to scour the dark web to find ad networks that behave that way.

Adsterra Didn’t Work For Me

I noticed that pages on my site would hang for a WHILE, so I did some investigation.

It turns out that the Adsterra JavaScript code was loading https://www.gatetodisplaycontent.com/7cb8fde86d9eb121ba106553cdc48d1a/invoke.js which would take more than a minute to finish (1.1 minutes according to the browser’s developer tools). It was a blocking call, so my site would not load while this script was executing.

Since breaking my site is unacceptable, Adsterra has been kicked out of the rotation.

That extremely long load time might also explain the terrible fill rate.

[Image: Adsterra statistics]

Bidvertiser served 3 times as many impressions on the same site over the same time period and did not prevent my site from loading.

I also noticed some weird behavior and could not easily figure out which script was causing it. Once in a while when I would click in a text box, a new tab would open to product.directredirection.com with a spammy malware site that was trying to trick me into installing a Chrome extension.

The URL:

[Image: directredirection.com malware URL]

The page:

[Image: product.directredirection.com Chrome extension malvertising page]

More investigation revealed that Adsterra has been known to serve malvertising in the past:

https://blog.malwarebytes.com/cybercrime/2016/04/magnitude-ek-activity-at-its-highest-via-adsterra-malvertising/

https://searchsecurity.techtarget.com/news/252473229/Adsterra-still-connected-to-malvertising-campaign-despite-denials

I removed their JavaScript snippet and haven’t had that malware auto-navigation happen again.

Now it’s down to Bidvertiser, Galaksion, and RevenueHits.

Adsaro Didn’t Work For Me

I’m not sure that Adsaro even works. In the past few days I’ve had zero impressions. Maybe I set up their JavaScript block wrong, but the fact that pages on their site take an eternity to load makes me think it’s them, not me.

Adsterra, Bidvertiser, Galaksion, and RevenueHits are still in the running.

Media.net Didn’t Work For Me

I received a message that my site was disapproved today, so they’re the first ones to fail out of my newest ad network comparison experiment.

Looking closer at their policies I see this (bold added by me):

“Our program has been designed for sites with premium content. Sites that promote, contain, or link directly to the following types of content shall not be approved.

  • Adult, Pornographic or any illegal content
  • Tobacco, alcohol, ammunition, hazardous substances, illegal drugs, gore, violence, gambling and racism content
  • Pages containing profanity or content that and/or discriminates or is offensive to any section of people
  • Hate, violence, racial intolerance, or advocate against any individual, group, or organization
  • Sale of prescription drugs
  • Sale of counterfeit products, imitations of designer or other goods, stolen items or any products that infringe intellectual property rights of other parties
  • Contain programs which promote invalid click activity by paying users to clicking on ads, browse websites, read email etc.
  • Websites that contain forums, discussion boards, chat rooms, or any content area that is open to public updates without adequate moderation
  • Sites with content that has been generated using computer programs and hence may not be comprehendible.
  • Bulk of the content is user-generated
  • Sites with fake news
  • Any other content that we believe in our sole discretion to be illegal”

So their network is ABSOLUTELY incompatible with a search engine since it links to everything on that list.

Still in the running are Adsaro, Adsterra, Bidvertiser, Galaksion, and RevenueHits.

New Site Advertising Experiment

I’m trying another round of ad network experiments. These are the 6 companies I’m trying out:

Adsaro

Adsterra

Bidvertiser

Galaksion

Media.net

RevenueHits

I might also add Adcash to the mix if I can get their site verification to work – I signed up but was unable to verify my site because their system was unable to access the website (I’ll assume someone pushed some broken code before the new year).

I have a randomizer in my site template that picks a random ad provider on each page load. It should distribute the traffic roughly evenly among them and as I get data and experience how each company impacts my site, I’ll eliminate the ones that aren’t right for me.
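
Here is a rough Python sketch of that kind of randomizer, not the actual template code from my site. The names AD_PROVIDERS and pick_ad_provider are made up for this illustration, and the simulation at the end only checks that a uniform random choice really does spread page loads roughly evenly.

```python
import random
from collections import Counter

# The six networks in this experiment.
AD_PROVIDERS = ["Adsaro", "Adsterra", "Bidvertiser", "Galaksion", "Media.net", "RevenueHits"]


def pick_ad_provider(providers=AD_PROVIDERS, rng=random):
    """Pick one provider uniformly at random for the current page load."""
    return rng.choice(providers)


# Simulate many page loads to see how the traffic is distributed.
rng = random.Random(42)  # seeded only so the simulation is repeatable
counts = Counter(pick_ad_provider(rng=rng) for _ in range(60_000))
for name, count in sorted(counts.items()):
    print(f"{name}: {count}")
```

Eliminating a network is then just a matter of removing its name from the list, and the remaining ones automatically share the traffic.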

Adsaro, Galaksion, and Media.net are all completely new to me.

I tried RevenueHits before and it made the experience on my site pretty terrible and glitchy. I’m giving them another chance, but they’ll be the first to be cut out if I find anything annoying going on.

I tried Adsterra before and it worked OK, but earned effectively nothing. I’m giving them another chance since it’s been a couple years and things might work better.

I’ve used Bidvertiser and they were the ad provider that was on the last version of WbSrch the longest. I didn’t have any significant income through them, but their ads were the least intrusive and managed to not be annoying at all. I suspect this one will not be the first one to be cut.

Windows Software by Lambda Centauri

I’ve written a lot of apps for Windows (and other) PCs. Originally I published everything as Zeta Centauri, but it was a weird combination of audio apps and utilities that didn’t mesh well with audio apps (calculators, word processing, image viewer, browser). I’ve launched a new website for the utility apps to keep them separate from the audio apps.

Check it out here: https://lambdacentauri.com

WbSrch Online Again

I found a way to get WbSrch online inexpensively, through a combination of code optimizations and an inexpensive high-disk-space internet provider. It doesn’t need fast SSD storage to serve the index data, so it works just fine on a mechanical hard drive, and it’s easier to get a lot of space inexpensively with one of those. Through a bunch of memory and query optimizations, it’s more zippy now on an inexpensive VPS than it was on a 12-core server with 192GB of memory and 8 SSDs. For now I’m running the crawler and indexer from home and pushing index updates to the server as they’re done.

I’ve been using it as my main search engine even though the indexes are a bit out of date, and the results have been better than I expected. It has definitely improved over the years.

Try it out:

https://wbsrch.com

Updating a wxWidgets project for Visual Studio 2019

I recently resurrected a dormant code project and went through the process of converting a wxWidgets 3.0 project to wxWidgets 3.1 and updating from Visual Studio 2010 to Visual Studio 2019.

Include Directories

Here are the things I had to change to make things build and run:

Change “Platform Toolset” to Visual Studio 2019 in General configuration properties.

Change include and library directories from wxWidgets 3.0.2 to 3.1.4 in VC++ Directories and update the include path for modern Visual Studio. The change to $(IncludePath) does a lot of magic things that will save a lot of trouble. Failure to update that will cause common includes like stdafx.h to be missing.

Change include from:
E:\lib\wxWidgets-3.0.2\include;$(VCInstallDir)include;$(VCInstallDir)atlmfc\include;$(WindowsSdkDir)include;$(FrameworkSDKDir)\include
to:
E:\lib\wxWidgets-3.1.4\include;$(IncludePath)

Code

The only code changes I had to make were to remove wxADJUST_MINSIZE anywhere it showed up.

Libraries

This is for the debug version of the project. Remove the “d” for libraries in the release version (e.g. wxmsw31ud_core.lib => wxmsw31u_core.lib).

These libraries showed up as missing:

comctl32.lib
rpcrt4.lib
uuid.lib
kernel32.lib

Adding C:\Program Files (x86)\Windows Kits\10\Lib\10.0.19041.0\um\x86 to the linker directories fixed this.

msvcprtd.lib

Adding C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.20.27508\lib\x86 to the linker directories fixed this.

ucrtd.lib

Adding C:\Program Files (x86)\Windows Kits\10\Lib\10.0.19041.0\ucrt\x86 to the linker directories fixed this.

wxregexu.lib

Adding that to the library list fixed it.

I suspected there was something similar to $(IncludePath) I could add to the library paths to make those resolve, but I wasn’t sure. So I tried $(LibraryPath). And it worked. Magic!

So do that instead of adding those individual directories.

Change the library path the same way as the include path, so that it is the wxWidgets 3.1.4 library directory followed by $(LibraryPath):
<wxWidgets 3.1.4 library directory>;$(LibraryPath)

Update all the libraries from wx 3.0 versions to wx 3.1 versions:

wxmsw30ud_core.lib => wxmsw31ud_core.lib
wxbase30ud.lib => wxbase31ud.lib
wxmsw30ud_adv.lib => wxmsw31ud_adv.lib
wxmsw30ud_html.lib => wxmsw31ud_html.lib
wxmsw30ud_xrc.lib => wxmsw31ud_xrc.lib
wxbase30ud_net.lib => wxbase31ud_net.lib
wxbase30ud_xml.lib => wxbase31ud_xml.lib
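
If there are a lot of these to change, the renaming can be scripted instead of edited by hand. This is a minimal Python sketch, not something I ran as part of this upgrade. It assumes the library names sit in a semicolon-separated dependency string like the one in a project file, and the function names upgrade_wx_libs and debug_to_release are made up for this illustration.

```python
import re

# Matches wx 3.0 library names such as wxmsw30ud_core.lib or wxbase30u.lib:
# prefix, version "30", build suffix ("u" or "ud"), optional component name.
WX30_LIB = re.compile(r"\b(wxmsw|wxbase)30(ud?)((?:_\w+)?\.lib)\b")

# Matches wx 3.1 debug library names such as wxmsw31ud_core.lib.
WX31_DEBUG_LIB = re.compile(r"\b(wxmsw|wxbase)31ud((?:_\w+)?\.lib)\b")


def upgrade_wx_libs(text: str) -> str:
    """Rewrite wx 3.0 library names to their 3.1 equivalents."""
    return WX30_LIB.sub(r"\g<1>31\2\3", text)


def debug_to_release(text: str) -> str:
    """Drop the debug "d" from wx 3.1 library names (ud -> u)."""
    return WX31_DEBUG_LIB.sub(r"\g<1>31u\2", text)


deps = "wxmsw30ud_core.lib;wxbase30ud.lib;comctl32.lib"
debug_deps = upgrade_wx_libs(deps)
print(debug_deps)
print(debug_to_release(debug_deps))
```

Libraries that are not part of wxWidgets, like comctl32.lib, are left untouched because they do not match either pattern.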

After these changes I was able to build and run my old project, which was originally written for wxWidgets 2.8 and then ported to wxWidgets 3.0.

New Version of the Vorbital Player Music Player

There’s a Windows music app I’ve been maintaining off and on over the last 10 years or so. Today I released an update with some significant user interface improvements.

The idea behind the Vorbital Player is to have a simple and uncluttered interface that just plays music and audio files and doesn’t try to manage your music library.

You can get it here:

https://zetacentauri.com/software_vorbital.htm

Vintage Stock Certificates 1990-1997

Belding Heminway 100 Shares 1990

First Charlotte Bank and Trust Company 16 Shares 1990

First Virginia Banks 100 Shares 1992

Peak Technologies Group 100 Shares 1992

Sun Distributors 10 Shares 1992

Systems and Computer Technology 300 Shares 1992

Dollar General Corporation 9 Shares 1993

First Bank of Philadelphia 23 Shares 1993

North European Oil Royalty Trust 300 Shares 1993

Twin City Bancorp 100 Shares 1995

Glenborough Realty Trust Incorporated 161 Shares 1996

Signature Resorts 19743 Shares 1997

Vintage Stock Certificates 1980-1989

Seal Fleet 100 Shares 1980

The Penn Central Corporation 2 Shares 1988

Westcoast Transmission Company 100 Shares 1982

Avery International 33333 Shares 1983

Ronson Corporation 100 Shares 1983

First Charlotte Bank and Trust Company 125 Shares 1984

United Merchants and Manufacturers 400 Shares 1984

Meredith Corporation 200 Shares 1985

Barnett Banks 3742 Shares 1987

Belding Heminway Company 2 Shares 1987

Heck’s 12 Shares 1987

Standard Federal Bank 1000 Shares 1988

Buttes Gas and Oil 979 Shares 1989

PAR Technology Corporation 1200 Shares 1989

Ronson Corporation 46 Shares 1989

Vintage Stock Certificates 1975-1979

Great Northern Nekoosa 25 Shares 1975

Pan American World Airways 100 Shares 1975

Pullman Incorporated 1 Share 1975

Refac Technology Development Corporation 1 Share 1975

Industrial Electronic Hardware 100 Shares 1976

North European Oil Royalty Trust 100 Shares 1976

San Juan Racing Association 1237 Shares 1976

UV Industries 1310 Shares 1977

Caesars World 20 Shares 1978

Columbus and Southern Ohio Electric Company 100 Preferred Shares 1978

Columbus and Southern Ohio Electric 1350 Preferred Shares 1978

United Technologies Corporation 30 Shares 1978

Caesars World 54 Shares 1979

Union Corporation 1 Share 1979

UV Industries 500 Shares 1979

Vintage Stock Certificates 1971-1974

American Brands 100 Shares 1971

Pan American Sulphur Company 400 Shares 1971

Straus-Duparquet 1000 Shares 1971

Studebaker-Worthington 5 Shares 1971

Unimed 25 Shares 1971

United Board and Carton Corporation 100 Shares 1971

Baldwin Securities 109 Shares 1972

United Merchants and Manufacturers 100 Shares 1972

Belknap 5 Shares 1973

Fashion Fabrics 5 Shares 1973

Major Realty 100 Shares 1973

Union Corporation 500 Shares 1973

Unisystems 100 Shares 1973

Lightolier 100 Shares 1974