The WbSrch Experiment

Off and on over the last 8 years I’ve worked on an independent search engine called WbSrch. It got about as good as the late-1990s search engines, which is great, because the original goal was to build something much like Altavista, my first “main” search engine.

At one point I tried to turn it into a real business. That went poorly and I shut it down. Then I brought it back to work on as a hobby/fun project. That was interesting and fun for a while, but it’s run its course. I’ve done all the things I set out to do and learned all the things I wanted to learn. I’ve had my fun, so there’s no need to tinker with web search anymore. It did keep me busy toward the end of the pandemic as I was starting to go stir crazy, and I’m grateful for that.

If you’d like to see what it looked like when I finished with it, take a look at this capture on archive.org.

If you’d like to use a pretty good alternative search engine, I suggest Mojeek or Yandex. The MusicSrch music search engine is still going, too.

And if you’d like to get a copy of some of the data I collected, there are a few inexpensive data downloads available.

Now that you’re here, feel free to explore the blog a bit. I have a bunch of websites and music projects I’ve created, and you might find some of them interesting (under the “My Stuff” section of the sidebar).

PoSSE and Facebook

One core idea of the “Indie Web” is “Publish on Your Own Site, Syndicate Elsewhere” (PoSSE). The idea is that you post content on your own website first and foremost, and then mirror it to social networks such as Facebook. This gives you more control over the original content, keeping it from being hidden behind a walled garden and preventing it from disappearing if you are banned from a site, it shuts down, the algorithms decide you’re not interesting, or it just decides to hide things older than X years.

It’s a good idea, and I think I’ll be implementing it a bit more in my own life. Don’t be surprised if you see more posts showing up and backfilling the site with non-recent publication dates. Most of my activity is on Facebook, but there is a little on Instagram, and even less on Twitter.

The one obvious drawback to publishing things publicly on your own site is that it lacks visibility controls like “friends only”. Those are valuable but not foolproof, because anyone can screenshot and forward anything. They do help keep down the number of randos sea-lioning into your conversations.

Since this blog intentionally does not allow comments, there’s little worry about that. There is still a little privacy concern, but as an Extremely Online Person, I don’t care much about privacy and everything is pretty much out there anyway.

Having Too Much Stuff

A lot of people, myself included, battle the accumulation of excess things. There’s a meme about minimalism vs. hoarding out there that goes into it, essentially that growing up without much makes it hard for people to let go of things they don’t necessarily need. I am not from a rich family.

A lot of people like to hate on minimalism because it’s “Just Another Boring Product Wealthy People Can Buy”, which the original post mentioned later in the thread. If that’s how you feel, by all means, live your life how you want. I certainly agree that white-on-white as an aesthetic is disgusting, and it’s why most American houses and apartments are hideously boring places I wouldn’t want to live. But to me a much worse aesthetic is “piles of stuff everywhere”. It’s visual distortion, and being in an ugly environment negatively affects my mood if I actually look at it.

The biggest trouble for me is that “the things you own end up owning you”. The more stuff you have, the less freedom you have, in more than one way. It could be less freedom to live in a small space, less freedom to get a piece of furniture someone’s giving away because it just wouldn’t fit, or less freedom to move to a new place because you’d need 8 days and a 60-foot moving truck to haul everything. It’s also less financial freedom because you’ve bought too much stuff, are spending a good chunk of your paycheck on a huge storage unit, or had to pay the movers for an extra 4 hours to get everything loaded.

If everything you own fits in your vehicle, it’s a whole lot easier to go live in a different city if you want to. More freedom.

It’s really hard to unlearn the collecting of things. A shelf full of books that you might read one day is easy to keep when the alternative is to get rid of them only to need to buy one of them again two years later because you suddenly got really interested in the topic. It’s hard to know whether keeping stuff or getting rid of stuff will be more expensive.

I think a good way to think about it is whether you would replace a particular thing if there was a catastrophic fire that destroyed everything you own. If the answer is yes, then by all means keep it. If the answer is no, maybe think about getting rid of it. There’s a lot of “I don’t know” gray area in that, but it’s a decent guideline.

“I might need this some day” is a cursed phrase.

For most objects, they’re just things and they don’t matter. The only real Human needs are food, shelter, a powerful laptop, and a good internet connection.

Removing Politics From Twitter

My disdain for Twitter is no secret. It is a cesspool of the worst people on Earth. But it does have some redeeming qualities if you can manage to filter out all the political nonsense.

Here’s how I filter out most of the crap (there are a few more that go off the screen, but not that many).

I should really turn off trends, but instead I either click “Not interested in the topic” or “This trend is harmful or spammy” when I see anything political. Anecdotally, clicking “not interested” seems to have more effect. I also not-interested sports topics since I’m genuinely not interested in any sports. They don’t make me angry, though.

I also block everyone who looks even remotely annoying and have built a block list of around 1000 people over the past 10 years or so. My block list is insane and is about 90% MAGA idiots (and there seems to be a deep supply of them) and about 10% always-outraged liberals. Most of the MAGA scum on Twitter are either bots or morons who are indistinguishable from bots. This does of course mean that I’m missing out on the finer details of the United States’ inevitable descent into totalitarian fascism, which is a real loss.

All in all, it is a LOT of effort to de-politicize your Twitter feed, and it’s probably not worth it. If Twitter had any sense, which they don’t, they’d add an option to filter out political nonsense. I think they know that if they added that option, there would be almost nothing left and most of the wingnuts would leave, destroying their monthly active user numbers. So, instead of making it a decent place where you can find useful information, they made it a place full of angry assholes always getting angrier about things. That’s the thing with social media — the algorithms LOVE to keep people outraged and angry because that results in more eyeballs-glued-to-the-site time.

My feed is for the most part now a mix of cute capybara pictures, 3D art, and pictures of Spain. You should probably follow CAPYBARA_MAN.

Or just don’t bother wasting your time with Twitter. That’s always an option. Fear not, you’re missing out on nothing.

My Musical Hiatus 2003-2015

I didn’t release much in the way of original music from 2003 to 2015. There was just the Positronic Empire album, which is more of an EP, and the Agaritine album, which is all software-created remixes.

I was focused on other things. From 2003-2005, pretty much all of my time was focused on finishing college. After that, I spent a decade establishing myself in a career in software, going from total n00b to a manager with a team of 5. That didn’t leave much time or motivation for music.

Most of the drum beats for 2008’s Positronic Empire were written in 2005 just as I was finishing school. I dug them out a few years later and finished them during a week of vacation. In 2012, the remix album Agaritine was created using the Echo Nest Remix API (which is now called Amen).

It wasn’t until 2015, after having built one of everything software-wise and having founded multiple startups, that I got back into music in a big way. Since then I’ve released 13 albums as Bloodless Mushroom, three as Toilet Duck Hunt, an album and an EP as OJ Champagne, a handful of singles as Rain Without End, and an EP with Sasha and The Children, a band that I performed live with for about a year.

In the middle of the pandemic, I got burnt out on music and didn’t really have any creativity flowing. I think creativity requires new experiences, and lockdown turned the new experiences knob down to zero. I’m only just now getting back into it as things have opened back up and will probably release another album this year. I don’t know if I have much more in me after that. I’m also sensing (and planning) a new wave of major life change, which may or may not bring musical creativity along with it.

Whether I do or don’t create more music isn’t particularly important. I’ve done a lot. I’ve released more music than most professionals do in their lifetime. I doubt that I’ll be remembered for my music, but that’s OK. I’ve only ever made it for myself and for my creative enjoyment.

Gear Hoarding

I’m thinking back to the time when I bought my first piece of music gear on eBay in 2001. It was a Yamaha TX81Z FM rack synth module, fairly beat up. It had a lot of cheesy, useless-sounding patches and a few really nice ones. I had a Yamaha DJX keyboard that my mom had bought me the previous Christmas. Together, with those two pieces of gear, I wrote Forest of Worlds at the house at 4026 Westway in Toledo. I didn’t have much other gear, just a bass guitar (I think it was an Ibanez Ergodyne EDC705, but it might have been something else) and an electric guitar, a modified Peavey Predator with a multi-effect pedal. There was really nothing I couldn’t do with that gear given enough talent/skill. Which I didn’t have yet.

Before Forest of Worlds, I had never written any music using the keyboard. Sure, there was a track where the DJX was playing drum sounds on my first album, but that wasn’t keyboard music. Before that, I had only written tracker, fractal, and guitar tunes. While it opened a whole new world of synthesizer music and spawned some beautiful-to-me songs like Trepidation, Encounters, Gliding, Cosmic Serenade, Quelet, Montagne, and others (in spite of the core of Bloodless Mushroom being a mix of fractal and tracker tunes), it also created a monster. From that moment on I started hoarding gear, collecting things less because they served a useful purpose and more because I could. I wanted to have every possible sound at my fingertips. I wanted to experience and explore everything out there in the world. And I pretty much did.

While the time spent playing and practicing made me a better musician, the gear hoarding did not. In fact, it actively detracted from my musicianship. I spent too much time fiddling with gear, noodling, and just shuffling things around, and not enough time practicing and writing music. I did create the SoundProgramming website from my explorations, which has helped a lot of people explore gear and get manuals for it, so it wasn’t all wasted effort.

Now I have every sound imaginable at my fingertips. I have so much software and so many libraries that there’s nothing I can’t do electronically (my sample library is more than 600 gigabytes). Since Bloodless Mushroom was always more of a tracker-and-fractal project, I never needed anything more than a laptop to write music in the first place. I certainly don’t need a whole room full of gear. In fact, the more in-the-box I work, the more creative I seem to be.

Just give me a keyboard (with MIDI). Practically any keyboard will do, but full-size keys help. Just give me a bass guitar and regular guitar and a cord to connect them with. The make and model doesn’t even matter, as long as they stay in tune. I do not need more gear than I can carry on my back. Well, as long as I’m not playing/writing drums. A real electronic or physical kit won’t fit on my back.

Proof of this just-plug-something-in-and-go is in the Rain Without End songs. They’re really just me multitracking guitar and bass. And it sounds good. Not perfect by any means, but I can put together nice-sounding ideas that people enjoy.

I must confess that using three GM-capable synths like I did for the Gymnopus album sure does sound good, though. All that can be achieved in software like Kontakt, of course. It just requires more detail work. If I do that work, the quality will be far beyond anything I could get with a 17-year-old hardware module.

What I’m trying to say is this: I don’t need to take any of this stuff with me. I can get what I need wherever I am, and I don’t need much.

RevenueHits Turned Out To Be Entirely Worthless

I’ve been running RevenueHits on one of my websites for about the last 5 months in a rotation with other ad networks. The site doesn’t get a huge amount of traffic, and it’s global, so I don’t expect massive returns. But I didn’t expect zero.

Here are my stats for May 2021.

And here are my stats for all of 2021:

For all of 2021, here are my top geographies:

A large percentage of my traffic comes from India, Pakistan, and Bangladesh, so I shouldn’t be getting $100 CPMs.

However, anything less than a tenth of a cent per click or a $0.01 CPM is unacceptable and pointless for even the most low-paying geographic locations. I know those 102 clicks earned them more than $0.00.
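
To put that floor in numbers, here is a quick back-of-the-envelope sketch. The 102 clicks come from the stats above. The rate of a tenth of a cent per click is only the minimum named above, not a figure any ad network has published, and the function name is made up for illustration.

```python
from decimal import Decimal

# Assumed floor: a tenth of a cent per click (the minimum named above,
# not a rate published by any ad network).
FLOOR_PER_CLICK = Decimal("0.001")


def minimum_expected_earnings(clicks: int, per_click: Decimal = FLOOR_PER_CLICK) -> Decimal:
    """Return the least a publisher should earn for the given number of clicks."""
    return clicks * per_click


# 102 clicks at the floor rate.
print(minimum_expected_earnings(102))  # 0.102
```

Even at that rock-bottom rate, 102 clicks should have shown up as about ten cents rather than nothing.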

I’ve removed RevenueHits from my site since they couldn’t even be bothered to send me a token fraction-of-a-penny like most barely-legit ad networks would have.

Bidvertiser was also running as one of the ad networks in rotation alongside RevenueHits and did generate earnings during this same time period. It’s the only ad network that I can trust to generate earnings from global traffic and so far they’ve proven themselves as the best Adsense alternative, which is why they’re the only provider left standing. There might be somewhere that would earn more, but doing so without shady popover/popunder ads, spammy push requests, interstitials, or any of the shady scammy tactics that make the internet almost unusable would be pretty difficult.

At some point I might run the which-ad-network-do-I-use experiment again, but it might be just about as much work to create an ad network of my own. It probably wouldn’t earn less.

Smarthost.net: Ultimately Just a Waste of Time

Smarthost.net has some nice “storage server” deals with some very configurable options. If you want a VPS with 1TB of disk space, their offerings are pretty attractive.

For my search engine, I need hosting with a good chunk of disk space in order to hold the index. It doesn’t need to be fast storage, and it doesn’t need a lot of CPU and RAM — retrieval of index entries is fast and efficient.

This made Smarthost look pretty ideal, so I signed up and got a web server going. It worked well for about a month. So I decided to set up a second one to do some light web crawling (you don’t get enough cores on their plans to do anything heavy).

After about a week, both machines were unreachable. I contacted support and found out that the drive array had failed on the machine hosting both of them. Support tried to recover it, but ultimately it was a total failure. So the little bit of web crawling data and about 2 weeks of search engine log data (everything since the last time I pulled it) were destroyed.

Annoying, but hardware failures happen.

A week later I found my crawler machine suspended because of a false positive on Spamhaus. Apparently their system is so badly-written that just visiting a domain with a web crawler can get you on a “bad list” for supposedly hosting a virus/malware. Many hosting providers, Smarthost included, will auto-suspend service for any box that gets on that list.

I got that machine removed from Spamhaus the same day and had it reactivated a few hours later to download the 200k pages or so that had been crawled, but support was pretty snarky about it. Clearly Smarthost is not a service that is compatible with what I do.

I ended up moving the web server to 1tbvps, which is slightly more expensive, but has more CPU cores and RAM, which is always nice. I moved the crawler to Digital Ocean, which is a very data-science-friendly service. We’ll see whether I have issues with those, but I suspect they will work better for my purposes.

Ultimately my 2-month experience with Smarthost ended up being a complete waste of time.

New Web Browser: Scleroglossa

For quite a while I’ve wanted to build a web browser based on the Gecko engine by Mozilla, which is what powers Firefox. Until recently I never had the right combination of time and motivation to dig in.

Well, now that I have, here’s the result – the Scleroglossa browser for Windows.

It’s available for download on the Lambda Centauri website.

Cleaner URLs Without Tracking Nonsense

Have you ever seen a link with a bunch of extra stuff on it? Facebook URLs with “fbclid=<big string of letters>” or links with a bunch of “utm_medium=<whatever>” or those horrendously long product links you get from Amazon?

They’re used for tracking behavior, and handy for people getting marketing and attribution data. If you don’t mind them, that’s cool. They annoy me a little because I like clean, readable URLs.

There’s a browser extension to get rid of them, called ClearURLs:

Chrome: https://chrome.google.com/webstore/detail/clearurls/lckanjgmijmafbedllaakclkaicjfmnk/related

Firefox: https://addons.mozilla.org/en-US/firefox/addon/clearurls/
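
For a sense of what that kind of cleanup involves, here is a minimal sketch in Python. It is not how ClearURLs works internally (the extension has its own, much larger rule set). It only strips the two kinds of parameters mentioned above, and the function name and example URL are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative rules only: the two kinds of parameters mentioned above.
TRACKING_PARAMS = {"fbclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    """Return the URL with known tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the parameters that survived the filter.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://example.com/post?id=7&utm_medium=social&fbclid=abc123"))
```

Parameters that aren't on the list, like the id in the example, are left alone, so the link still works after cleaning.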

I Don’t Care About Cookies

I’m tired of websites showing me cookie warnings that I have to click through to remove some sort of overlay that obscures some portion of the site. I have not nor will I ever care about cookies. They’re a built-in part of the browser that should just work invisibly, and they’re an important part of making apps work.

There’s an extension that’s called, appropriately, “I Don’t Care About Cookies”. Here it is:

Firefox: https://addons.mozilla.org/en-US/firefox/addon/i-dont-care-about-cookies/

Chrome: https://chrome.google.com/webstore/detail/i-dont-care-about-cookies/fihnjjcciajhdojfnbdddfaoknhalnja

Galaksion Didn’t Work For Me

I’ve removed Galaksion from the ad networks being used on my site.

It’s a shame that I had to, because I had a generally good feeling about them. I liked their website, their publisher dashboard, and how easy it was to sign up and set up my site to use them.

Their ad fill rate was not great, but I don’t expect that of a new site that hasn’t been sent much traffic yet. The ads that did fill at least had a non-zero CPM rate.

But there’s one thing that happened that I can’t live with.

While testing some updates to my site I clicked on a text box to edit some details. A new tab automatically opened to some ad-based destination. I did not click on an ad to get there. I might have inaccurately attributed an auto-navigation to Adsterra a few days ago, but they would have been removed anyway due to their slow script and poor reputation. I don’t care to do the research to figure out which of the two was the real culprit in that particular malvertising navigation (it was probably Adsterra, I’m just not 100% confident).

Anything that will clickjack or hinder my site’s functionality is not welcome, so they’ve been removed from the rotation. Now we’re down to just RevenueHits and Bidvertiser.

I really wish this process involved me removing providers because I didn’t earn as much with them rather than dealing with technical issues that break my site. How is above-ground malvertising and clickjacking even still a thing in 2020? I should have to scour the dark web to find ad networks that behave that way.

Adsterra Didn’t Work For Me

I noticed that pages on my site would hang for a WHILE, so I did some investigation.

It turns out that the Adsterra JavaScript code was loading https://www.gatetodisplaycontent.com/7cb8fde86d9eb121ba106553cdc48d1a/invoke.js which would take more than a minute to finish (1.1 minutes according to the browser’s developer tools). It was a blocking call, so my site would not load while this script was executing.

Since breaking my site is unacceptable, Adsterra has been kicked out of the rotation.

That extremely long load time might also explain the terrible fill rate.

Adsterra Statistics

Bidvertiser served 3 times as many impressions on the same site over the same time period and did not prevent my site from loading.

I also noticed some weird behavior and could not easily figure out which script was causing it. Once in a while when I would click in a text box, a new tab would open to product.directredirection.com with a spammy malware site that was trying to trick me into installing a Chrome extension.

The URL:

directredirection.com malware

The page:

product.directredirection.com Chrome extension malvertising

More investigation revealed that Adsterra has been known to serve malvertising in the past:

https://blog.malwarebytes.com/cybercrime/2016/04/magnitude-ek-activity-at-its-highest-via-adsterra-malvertising/

https://searchsecurity.techtarget.com/news/252473229/Adsterra-still-connected-to-malvertising-campaign-despite-denials

I removed their JavaScript snippet and haven’t had that malware auto-navigation happen again.

Now it’s down to Bidvertiser, Galaksion, and RevenueHits.

Adsaro Didn’t Work For Me

I’m not sure that Adsaro even works. In the past few days I’ve had zero impressions. Maybe I set up their JavaScript block wrong, but the fact that pages on their site take an eternity to load makes me think it’s them, not me.

Adsterra, Bidvertiser, Galaksion, and RevenueHits are still in the running.

Media.net Didn’t Work For Me

I received a message that my site was disapproved today, so they’re the first ones to fail out of my newest ad network comparison experiment.

Looking closer at their policies I see this (bold added by me):

“Our program has been designed for sites with premium content. Sites that promote, contain, or link directly to the following types of content shall not be approved.

  • Adult, Pornographic or any illegal content
  • Tobacco, alcohol, ammunition, hazardous substances, illegal drugs, gore, violence, gambling and racism content
  • Pages containing profanity or content that and/or discriminates or is offensive to any section of people
  • Hate, violence, racial intolerance, or advocate against any individual, group, or organization
  • Sale of prescription drugs
  • Sale of counterfeit products, imitations of designer or other goods, stolen items or any products that infringe intellectual property rights of other parties
  • Contain programs which promote invalid click activity by paying users to clicking on ads, browse websites, read email etc.
  • Websites that contain forums, discussion boards, chat rooms, or any content area that is open to public updates without adequate moderation
  • Sites with content that has been generated using computer programs and hence may not be comprehendible.
  • Bulk of the content is user-generated
  • Sites with fake news
  • Any other content that we believe in our sole discretion to be illegal”

So their network is ABSOLUTELY incompatible with a search engine since it links to everything on that list.

Still in the running are Adsaro, Adsterra, Bidvertiser, Galaksion, and RevenueHits.

New Site Advertising Experiment

I’m trying another round of ad network experiments. These are the 6 companies I’m trying out:

Adsaro

Adsterra

Bidvertiser

Galaksion

Media.net

RevenueHits

I might also add Adcash to the mix if I can get their site verification to work – I signed up but was unable to verify my site because their system was unable to access the website (I’ll assume someone pushed some broken code before the new year).

I have a randomizer in my site template that picks a random ad provider on each page load. It should distribute the traffic roughly evenly among them and as I get data and experience how each company impacts my site, I’ll eliminate the ones that aren’t right for me.
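
The randomizer idea can be sketched in a few lines of Python. This is not the actual template code from my site, just an illustration of the technique. The function name is hypothetical, and the simulation at the end only checks that the split comes out roughly even.

```python
import random
from collections import Counter

# The six networks in this round of the experiment.
AD_PROVIDERS = ["Adsaro", "Adsterra", "Bidvertiser", "Galaksion", "Media.net", "RevenueHits"]


def pick_ad_provider(providers=AD_PROVIDERS, rng=random):
    """Pick one provider uniformly at random; meant to be called once per page load."""
    return rng.choice(providers)


# Simulate 6,000 page loads to see how evenly the traffic is spread.
counts = Counter(pick_ad_provider() for _ in range(6000))
for name, count in sorted(counts.items()):
    print(f"{name}: {count}")
```

Eliminating a network is then just a matter of deleting its name from the list, and the remaining ones split the traffic among themselves.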

Adsaro, Galaksion, and Media.net are all completely new to me.

I tried RevenueHits before and it made the experience on my site pretty terrible and glitchy. I’m giving them another chance, but they’ll be the first to be cut out if I find anything annoying going on.

I tried Adsterra before and it worked OK, but earned effectively nothing. I’m giving them another chance since it’s been a couple years and things might work better.

I’ve used Bidvertiser and they were the ad provider that was on the last version of WbSrch the longest. I didn’t have any significant income through them, but their ads were the least intrusive and managed to not be annoying at all. I suspect this one will not be the first one to be cut.

Windows Software by Lambda Centauri

I’ve written a lot of apps for Windows (and other) PCs. Originally I published everything as Zeta Centauri, but it was a weird combination of audio apps and utilities that didn’t mesh well with audio apps (calculators, word processing, image viewer, browser). I’ve launched a new website for the utility apps to keep them separate from the audio apps.

Check it out here: https://lambdacentauri.com

WbSrch Online Again

I found a way to get WbSrch online inexpensively, through a combination of code optimizations and an inexpensive high-disk-space hosting provider. It doesn’t need fast SSD storage to serve the index data, so it works just fine on a mechanical hard drive, and it’s easier to get a lot of space inexpensively with one of those. Through a bunch of memory and query optimizations, it’s more zippy now on an inexpensive VPS than it was on a 12-core server with 192GB of memory and 8 SSDs. For now I’m running the crawler and indexer from home and pushing index updates to the server as they’re done.

I’ve been using it as my main search engine even though the indexes are a bit out of date, and the results have been better than I expected. It has definitely improved over the years.

Try it out:

https://wbsrch.com

Updating a wxWidgets project for Visual Studio 2019

I recently resurrected a dormant code project and went through the process of converting a wxWidgets 3.0 project to wxWidgets 3.1 and updating from Visual Studio 2010 to Visual Studio 2019.

Include Directories

Here are the things I had to change to make things build and run:

Change “Platform Toolset” to Visual Studio 2019 in General configuration properties.

Change include and library directories from wxWidgets 3.0.2 to 3.1.4 in VC++ Directories and update the include path for modern Visual Studio. The change to $(IncludePath) does a lot of magic things that will save a lot of trouble. Failure to update that will cause common includes like stdafx.h to be missing.

Change include from:
E:\lib\wxWidgets-3.0.2\include;$(VCInstallDir)include;$(VCInstallDir)atlmfc\include;$(WindowsSdkDir)include;$(FrameworkSDKDir)\include
to:
E:\lib\wxWidgets-3.1.4\include;$(IncludePath)

Code

The only code changes I had to make were to remove wxADJUST_MINSIZE anywhere it showed up.

Libraries

This is for the debug version of the project. Remove the “d” for libraries in the release version (e.g. wxmsw31ud_core.lib => wxmsw31u_core.lib).

These libraries showed up as missing:

comctl32.lib
rpcrt4.lib
uuid.lib
kernel32.lib

Adding C:\Program Files (x86)\Windows Kits\10\Lib\10.0.19041.0\um\x86 to the linker directories fixed this.

msvcprtd.lib

Adding C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.20.27508\lib\x86 to the linker directories fixed this.

ucrtd.lib

Adding C:\Program Files (x86)\Windows Kits\10\Lib\10.0.19041.0\ucrt\x86 to the linker directories fixed this.

wxregexu.lib

Adding that to the library list fixed it.

I suspected there was something similar to $(IncludePath) I could add to the library paths to make those resolve, but I wasn’t sure. So I tried $(LibraryPath). And it worked. Magic!

So do that instead of adding those individual directories.

Change the library path the same way as the include path: point it at the wxWidgets 3.1.4 library directory and replace the individual Visual Studio and Windows SDK library directories with $(LibraryPath).

Update all the libraries from wx 3.0 versions to wx 3.1 versions:

wxmsw30ud_core.lib => wxmsw31ud_core.lib
wxbase30ud.lib => wxbase31ud.lib
wxmsw30ud_adv.lib => wxmsw31ud_adv.lib
wxmsw30ud_html.lib => wxmsw31ud_html.lib
wxmsw30ud_xrc.lib => wxmsw31ud_xrc.lib
wxbase30ud_net.lib => wxbase31ud_net.lib
wxbase30ud_xml.lib => wxbase31ud_xml.lib
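
If there are a lot of these to change, the renaming is mechanical enough to script. Here is a rough Python sketch, assuming every library name follows the pattern in the list above (wxmsw or wxbase, then 30, then "u", then an optional "d" for debug). The function name is made up, and anything that doesn't fit the pattern is left alone.

```python
import re

# Matches names like wxmsw30ud_core.lib or wxbase30ud.lib:
# prefix, version 30, "u", optional debug "d", optional _component.
WX_LIB = re.compile(r"^(wxmsw|wxbase)30u(d?)((?:_\w+)?)\.lib$")


def upgrade_lib(name: str, release: bool = False) -> str:
    """Map a wxWidgets 3.0 library name to its 3.1 equivalent.

    With release=True the debug "d" is dropped as well.
    Names that do not match the pattern are returned unchanged.
    """
    match = WX_LIB.match(name)
    if not match:
        return name
    prefix, debug, component = match.groups()
    if release:
        debug = ""
    return f"{prefix}31u{debug}{component}.lib"


for lib in ["wxmsw30ud_core.lib", "wxbase30ud.lib", "wxbase30ud_xml.lib"]:
    print(lib, "=>", upgrade_lib(lib), "/", upgrade_lib(lib, release=True))
```

The second name on each printed line is the release version, with the “d” removed.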

After these changes I was able to build and run my old project, which was originally written for wxWidgets 2.8 and then ported to wxWidgets 3.0.

New Version of the Vorbital Player Music Player

There’s a Windows music app I’ve been maintaining off and on over the last 10 years or so. Today I released an update with some significant user interface improvements.

The idea behind the Vorbital Player is to have a simple and uncluttered interface that just plays music and audio files and doesn’t try to manage your music library.

You can get it here:

https://zetacentauri.com/software_vorbital.htm