Part 4 - Specific vs Generic

Started by CultLeader, June 02, 2021, 08:09:27 AM

CultLeader

Another cornerstone of logic, needed for software development with the pattern and even in life in general, is that specific solutions will always be superior to generic solutions.

For instance, Clickhouse will always be faster than Postgres for analytical workloads. All the database was optimized to do is utilize all cores and process tons of data fast. This fact has ramifications throughout all logical levels of Clickhouse design choices:

  • Data is stored in columnar format
  • All cores are utilized for queries (few users assumed)
  • Data is sorted and compressed to get max read throughput
  • There are no real transactions, append only workload is assumed, hard to delete rows
  • Inserts go straight to disk without buffering, hence the user has to batch inserts for max throughput
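The first bullet can be illustrated with a toy sketch. This is not how Clickhouse lays out data on disk, just the general idea of row versus columnar layout; the `row` and `columns` types and every function name here are made up for illustration.

```ocaml
(* Row layout: one record per row, all fields of a row sit together. *)
type row = { user_id : int; country : string; amount : int }

(* Columnar layout: one array per column, all values of a column sit together. *)
type columns = {
  user_ids : int array;
  countries : string array;
  amounts : int array;
}

(* Hypothetical helper: turn an array of rows into per-column arrays. *)
let to_columns (rows : row array) : columns = {
  user_ids = Array.map (fun r -> r.user_id) rows;
  countries = Array.map (fun r -> r.country) rows;
  amounts = Array.map (fun r -> r.amount) rows;
}

(* Row layout: the scan walks over whole rows just to reach one field. *)
let sum_amount_rows (rows : row array) : int =
  Array.fold_left (fun acc r -> acc + r.amount) 0 rows

(* Columnar layout: the scan touches only the one array it needs. *)
let sum_amount_columns (c : columns) : int =
  Array.fold_left ( + ) 0 c.amounts

(* Made-up sample data. *)
let rows = [|
  { user_id = 1; country = "LT"; amount = 10 };
  { user_id = 2; country = "DE"; amount = 20 };
  { user_id = 3; country = "LT"; amount = 30 };
|]

let () =
  let cols = to_columns rows in
  (* prints "60 60" *)
  Printf.printf "%d %d\n" (sum_amount_rows rows) (sum_amount_columns cols)
```

Both functions return the same answer. The difference is that an analytical query over one column only has to read that column, and values of one type stored next to each other sort and compress well.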

This is radically different from Postgres, which aims to be an OLTP database from which you can serve a web application and have ACID transactions across multiple tables. It is made to handle many concurrent queries at once and uses MVCC, so each row version has a lifetime start and a lifetime end and readers don't block writers. It also has a write-ahead log (WAL) to prevent losing data after changes. Data is stored in rows so that deletes and inserts are efficient.
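To make the "lifetime start and lifetime end" idea concrete, here is a toy model of MVCC visibility. It is a heavy simplification: real Postgres also tracks whether transactions committed and which ones were in progress when the snapshot was taken, and none of that is modelled here. All type and function names are hypothetical.

```ocaml
(* Toy model: each row version records the transaction that created it
   and, once updated or deleted, the transaction that ended it. *)
type row_version = {
  value : string;
  created_by : int;         (* lifetime start *)
  deleted_by : int option;  (* lifetime end, None while still live *)
}

(* A reader with snapshot [snap] sees a version if it was created at or
   before the snapshot and was not yet ended as of the snapshot. *)
let visible (snap : int) (v : row_version) : bool =
  v.created_by <= snap
  && (match v.deleted_by with
      | None -> true
      | Some d -> d > snap)

(* An update ends the live version and appends a new one. Readers on an
   older snapshot keep seeing the old version, so they never wait. *)
let update (txid : int) (new_value : string) (versions : row_version list)
    : row_version list =
  let ended =
    List.map
      (fun v ->
        if v.deleted_by = None then { v with deleted_by = Some txid } else v)
      versions
  in
  ended @ [ { value = new_value; created_by = txid; deleted_by = None } ]

(* Everything a reader with snapshot [snap] can see. *)
let read (snap : int) (versions : row_version list) : string list =
  versions |> List.filter (visible snap) |> List.map (fun v -> v.value)

(* Transaction 1 inserts the row, transaction 5 updates it. *)
let history =
  [ { value = "balance=100"; created_by = 1; deleted_by = None } ]
  |> update 5 "balance=80"
```

In this model `read 3 history` gives the old value and `read 7 history` gives the new one. Keeping this bookkeeping for every row is part of the price Postgres pays for its guarantees, and Clickhouse simply does not pay it.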

There is no query you can execute in Clickhouse that you could not also execute on Postgres. Postgres also provides stronger guarantees against data corruption and ensures correctness. However, since Clickhouse solves a more specific problem, namely data warehousing, it can avoid implementing all the complexity that Postgres needs to uphold its guarantees. Row deletion? It doesn't really matter in Clickhouse, hence you can always quickly append more data to the table. Transactions? Clickhouse is not really interested in them, as you can insert duplicates if you want and Clickhouse can merge and remove dupes in the background. Cores? We assume there are few users (typically analysts) querying the data, so a query can be given all the cores the server has, which we can't do in Postgres, since Postgres assumes many requests are hitting the database at once.

Postgres is a great database, and for the OLTP workloads it handles it is a perfect fit. But it will never be as good as Clickhouse for OLAP workloads, because Clickhouse specializes in them. Likewise, if you want to keep your sanity developing a web application, you would never use Clickhouse to store realtime user data; it is not fit for that either.

Both databases have their place under the sun. And I sound like a typical leftist cuck saying that, so, I'll say another thing.

There's Apache Druid that tries to do the same as Clickhouse and is an utter failure, just like typical Java shit under the sun. Yet some idiot writes a blogpost https://leventov.medium.com/comparison-of-the-open-source-olap-systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7 claiming that:

Quote: For a wide range of applications, neither ClickHouse nor Druid or Pinot are obvious winners. First and foremost, I recommend to take into account, the source code of which system your are able to understand, fix bugs, add features, etc. The section "On Performance Comparisons and Choice of the System" discusses this more.

Clickhouse is not an obvious winner against Druid or Pinot? What kind of smokepipe is this idiot smoking? Its simpler, faster, and doesn't eat infinite memory for simple tasks like typical Java shit and is not complex to deploy. Obviously, doing more with less servers and simpler maintenance is cheaper so Cloudflare went with Clickhouse for obvious reasons https://blog.cloudflare.com/how-cloudflare-analyzes-1m-dns-queries-per-second/ .

So, there are a few niches of components, like OLAP database or OLTP database, and I have one and only one choice for each: Clickhouse for OLAP and Postgres for OLTP. I do not drown in the pagan delusion that all the competitors in each category are best in their own right. I mean, is Clickhouse version 7 better than version 6? If you say yes, you admit one is better than the other, and that is just between versions of the same component. What about different components? They can never be equal either, and one will be better than all the others. It can't be any other way. And ideally you'd want to find that component.

Software Development

How does this tie to software development?

I see people from an OOP background abusing interfaces. I did this myself as a greenhorn early in my career. You think of some abstraction, then think of its implementors. What inevitably happens is that the system becomes full of spaghetti code and confusion. 99% of the time you write a new implementation and the old interface is not enough, so you must refactor it. Then you add a method to the interface, it ripples through the rest of the implementors, and you usually end up implementing a dummy method there. What a waste.

As I got older I started appreciating ifs, elses and matches way more than interfaces for solving the problem of different flows. Under an if branch you can make the simple assumption that all the flow under that branch is consistent and specific to doing one thing, with all its assumptions. Hence, there is no nasty intertwined logic between different object implementations and the code is simpler.

I've been doing OCaml for n years now. The name stands for Objective Caml, and I have never implemented a real object. They are just not needed. Structures and functions are to this day all I use in OCaml. I got out of using objects even before I found OCaml and I have never missed them since.

OOP pattern vs specific ifs

Say you have a task to query prices for different markets from different bitcoin exchanges. Well, the APIs are quite different on most of them.

Say exchange A lets you query prices directly with an API key, without creating a session. Exchange B, on the other hand, first requires you to get a session key, and only then can you query prices.

Typical OOP way would be for Exchange A:


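interface PriceApi {
  double getPrice();
}

class ExchangeA implements PriceApi {
  // Interface methods are implicitly public, so the implementation must be too.
  @Override
  public double getPrice() {
    ...
    return thePrice;
  }
}

...

// The flow that ties it together; it only knows about the interface.
double queryPrice(PriceApi api) {
  return api.getPrice();
}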


I added the queryPrice flow that ties it all together with the implementation.

Okay, a simple API? For exchange A, that is. Now we add exchange B, which needs a session key:


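interface PriceApi {
  void initSession();
  double getPrice();
}

class ExchangeA implements PriceApi {
  // Exchange A needs no session, so this is a dummy method.
  @Override
  public void initSession() {}

  @Override
  public double getPrice() {
    ...
    return thePrice;
  }
}

class ExchangeB implements PriceApi {
  @Override
  public void initSession() {
    // do some magicka to initiate the session
    ...
  }

  @Override
  public double getPrice() {
    ...
    return thePrice;
  }
}

...

// The shared flow now calls initSession() for every exchange, needed or not.
double queryPrice(PriceApi api) {
  api.initSession();
  return api.getPrice();
}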


Look, we had to accommodate the init session flow, so we had to change:

  • The interface
  • The exchange A implementor
  • The flow that ties them all together

We added one more exchange and we had to change three places!

What I would rather do now, with OCaml, is much simpler:


type exchange =
  | ExchangeA
  | ExchangeB

let query_price =
  function
  | ExchangeA -> (
      exchange_a_get_price ()
    )
  | ExchangeB -> (
      exchange_b_init_session ();
      exchange_b_get_price ()
    )


Exchange flows are separate and matched. Also, if they had arguments, you could encode them in the variant constructors. The ExchangeB code separately does all it needs, and there is no confusion or need to change the other exchange's implementation. This is sanity. This is specific. Java interfaces are generic and grow like a blob of insanity as more implementations are added. To write a perfect interface from scratch you need hindsight into all the use cases, which is practically impossible.
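Here is a minimal sketch of what encoding the arguments could look like. The three `exchange_*` functions are hypothetical stand-ins for the real HTTP calls, and their arguments and returned prices are made up so the example runs on its own.

```ocaml
(* Each exchange carries exactly the data its own flow needs. *)
type exchange =
  | ExchangeA of { api_key : string }
  | ExchangeB of { user : string; password : string }

(* Hypothetical stubs; a real implementation would call the exchange. *)
let exchange_a_get_price (_api_key : string) : float = 100.0

let exchange_b_init_session (_user : string) (_password : string) : string =
  "session-token"

let exchange_b_get_price (_session : string) : float = 101.5

let query_price = function
  | ExchangeA { api_key } ->
      (* Exchange A: the API key is enough, no session step. *)
      exchange_a_get_price api_key
  | ExchangeB { user; password } ->
      (* Exchange B: get a session first, then query with it. *)
      let session = exchange_b_init_session user password in
      exchange_b_get_price session

(* Made-up credentials. *)
let () =
  let a = query_price (ExchangeA { api_key = "key" }) in
  let b = query_price (ExchangeB { user = "u"; password = "p" }) in
  Printf.printf "%.1f %.1f\n" a b
```

Adding a third exchange means adding one constructor and one match arm. If you forget the arm, the compiler warns that the match is not exhaustive, and the existing ExchangeA and ExchangeB branches are never touched.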

And I'm sure someone will say "well, that is a bad example, you could have solved this problem with X, Y, Z in Java", to which I reply: every Java codebase is a bad example. I've never seen a sane Java project. I've browsed hadoop, kafka, zookeeper and plenty of other repositories I've forgotten, and it's always the same:

  • Thousands of interfaces
  • Thousands of implementations
  • Indirect call stacks so deep, you have nightmares about developing with that code
  • 99% of interface implementation methods are just indirectly calling yet another implementation

People who develop Java are seriously sick in the head. These people need help. They cannot write maintainable code. This is how unhealthy cults start and is perfect job security. Instead of writing simple ifs and elses they by default go for interfaces. Utter insanity.

Real life examples

Consider a car, say, a lambo. A lambo is optimized at every level to go fast and look good. Carbon fiber, low body, hard tyres, aerodynamics and all that stuff. Not much cargo space, engine in the back.

Or consider the Mercedes G class, a car made to drive offroad. It has a high body, soft suspension, usually soft tyres, and zero interest in aerodynamics. An entirely different profile of car.

There are unlikely to be any parts compatible between the two cars; everything is different.

This is the generic car that a typical Java developer would make:

  • Generic body
  • Generic wheels
  • Generic engine

Want this generic car to be aerodynamic? Tough, you cannot, the generic body doesn't support that, and bending it requires adjusting other parts and their defined spaces.
Want this generic car to have a very powerful engine? Tough, you cannot, generic car body only has space for a weak small engine.
Want this generic car to have very large wheels to go offroad? Tough, you cannot, the body cannot fit those wheels.

See what I'm getting at? You cannot make a generic shitty car a perfect fit for any specific purpose. You have to design the car to be aligned with its purpose from the start, and optimize all the parts and fit them together accordingly. You cannot build a lambo and put a 1.5L petrol engine in it: such a lambo will not be fast and loses its entire point and the buyer's interest. Every single part of the car has to say the same thing and show the same vision of the car's purpose.

I wish most Java developers understood this and instead of producing tons of worthless crappy code with interfaces in name of reusability that nobody could maintain would be more useful to society flipping burgers at McDonald's instead.

Specificity also permeates the pattern: everything is specific, no generic interfaces, hard types everywhere, hard implementations, maximum checking of specific inconsistencies and correctness at compile time. Ideally nothing is left where users can make trivial mistakes, like parsing JSON by hand, sending raw SQL queries to the database, forgetting to configure a certain secret and so on.
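As one small illustration of the "forgetting to configure a certain secret" case, here is a sketch of how hard types can catch it at compile time. This is an assumption about how such a check could look, not a description of how the pattern itself does it. The `secrets` and `environment` types and all values are made-up placeholders.

```ocaml
(* Every secret the application needs is a required record field. *)
type secrets = {
  db_password : string;
  exchange_b_password : string;
}

type environment = Staging | Production

(* Placeholder values only. Leaving out a field in either branch is a
   compile error ("Some record fields are undefined"), and leaving out an
   environment triggers the non-exhaustive match warning. *)
let secrets_for = function
  | Staging ->
      { db_password = "staging-db"; exchange_b_password = "staging-b" }
  | Production ->
      { db_password = "prod-db"; exchange_b_password = "prod-b" }

let () =
  let s = secrets_for Staging in
  Printf.printf "%d secrets configured\n"
    (List.length [ s.db_password; s.exchange_b_password ])
```

Compare that with a generic string-to-string map of settings, where a missing key is only discovered when the code that needs it finally runs.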

Marriage

You have two choices as a man:

  • A used up slut
  • A virgin

Naturally testosterone dictates to a man that marrying a non virgin slut is disgusting. And a soy latte sipping leftist faglet is happy marrying up a slut that had tens of sexual partners because that's the only woman he'll ever get.

A womans entire desire and purpose and life ought to be to worship and please the one and only man. One woman should naturally be thinking about one specific man that should always be in her mind.

What is the case with a slut? She slept with tens if not hundred of chad guys. Will she ever think only of one man that swept her of her feet? No, she became damaged, generic, used up and not a wife material. Man that marries up such slut will certainly won't be the most attractive man she ever met and she'll be constantly thinking about the great sex she had before she married this chump. And infidelity with divorce is very likely. If someone doesn't know, statistical chance of divorce increases exponentially with brides sexual partner count.

So, marry a virgin so her heart would only be filled with specific things about you and don't marry a slut, which has her heart filled with all the other chads before you. Just like a used up, smurfed out Java projects, with many interfaces for all the use cases which will never be able to compete with dedicated solutions to a certain problem.