codesod

CodeSOD: The First 10,000

Alicia recently inherited a whole suite of home-grown enterprise applications. Like a lot of these kinds of systems, it needs to do batch processing. She went tracking down a mysterious IllegalStateException only to find this query causing the problem:

select * from data_import where id > 10000

The query itself is fine, but the code calling it checks to see if this query returned any rows- if it did, the code throws the IllegalStateException.

First, of course, this should be a COUNT(*) query- no need to actually return rows here. But also… what? Why do we fail if there are any transactions with an ID greater than 10000? Why on Earth would we care?

Well, the next query it runs is this:

update data_import set id=id+10000

Oh. Oh no. Oh nooooo. Are they… are they using the ID to also represent some state information about the status of the record? It sure seems like it!

The program then starts INSERTing data, using a counter which starts at 1. Once all the new data is added, the program then does:

delete from data_import where id > 10000

All this is done within a single method, with no transactions and no error handling. And yes, this is by design. You see, if anything goes wrong during the inserts, then the old records don't get deleted, so we can see that processing failed and correct it. And since the IDs are sequential and always start at 1, we can easily find which row caused the problem. Who needs logging or any sort of exception handling- just check your IDs.

The underlying reason why this started failing was because the inbound data started trying to add more than 10,000 rows, which meant the INSERTs started failing (since we already had rows there for this). Alicia wanted to fix this and clean up the process, but too many things depended on it working in this broken fashion. Instead, her boss implemented a quick and easy fix: they changed "10000" to "100000".

[Advertisement] Picking up NuGet is easy. Getting good at it takes time. Download our guide to learn the best practice of NuGet for the Enterprise.




codesod

CodeSOD: Querieous Strings

When processing HTTP requests, you frequently need to check the parameters which were sent along with that request. Those parameters are generally passed as stringly-typed key/value pairs. None of this is news to anyone.

What is news, however, is how Brodey's co-worker indexed the key/value pairs.

For i As Integer = 0 To (Request.Params().Count - 1)
    If (parameters.GetKey(i).ToString() <> "Lang") Then
        If (parameters.GetKey(i).Equals("ID")) OrElse (parameters.GetKey(i).Equals("new")) OrElse _
             (parameters.GetKey(i).Equals("open")) OrElse (parameters.GetKey(i).Equals("FID")) _
         OrElse (parameters.GetKey(i).Equals("enabled")) OrElse (parameters.GetKey(i).Equals("my")) OrElse _
         (parameters.GetKey(i).Equals("msgType")) OrElse (parameters.GetKey(i).Equals("Type")) _
         OrElse (parameters.GetKey(i).Equals("EID")) OrElse (parameters.GetKey(i).Equals("Title")) OrElse _
         (parameters.GetKey(i).Equals("ERROR")) Then
            URLParams &= "&" & parameters.GetKey(i).ToString()
            URLParams &= "=" & parameters(i).ToString()
        End If
    End If
Next

The goal of this code is to take a certain set of keys and construct a URLParams string which represents those key/values as an HTTP query string. The first thing to get out of the way: .NET has a QueryString type that handles the construction of the query string for you (including escaping), so that you don't need to do any string concatenation.

But the real WTF is everything surrounding that. We opt to iterate across every key- not just the ones we care about- and use the GetKey(i) function to check each individual key in an extensive chain of OrElse statements.

The obvious and simpler approach would have been to iterate across an array of the keys I care about- ID, new, FID, enabled, my, msgType, Type, EID, Title, ERROR- and simply check if they were in the Request.

I suppose the only silver lining here is that they thought to use the OrElse operator- which is a short-circuiting "or" operation, like you'd expect in just about any other language, instead of Or, which doesn't short circuit (pulling double duty as both a bitwise Or and a logical Or, because Visual Basic wants to contribute some WTFs).

[Advertisement] Plan Your .NET 9 Migration with Confidence
Your journey to .NET 9 is more than just one decision.Avoid migration migraines with the advice in this free guide. Download Free Guide Now!




codesod

CodeSOD: Join Our Naming

As a general rule, if you're using an RDBMS and can solve your problem using SQL, you should solve your problem using SQL. It's how we avoid doing joins or sorts in our application code, which is always a good thing.

But this is a general rule. And Jasmine sends us one where solving the problem as a query was a bad idea.

ALTER   FUNCTION [dbo].[GetName](@EntityID int)

RETURNS varchar(200)

AS

BEGIN

declare @Name varchar(200)

select @Name =
  case E.EntityType
    when 'Application'  then A.ApplicationName
    when 'Automation'   then 'Automated Process'
    when 'Group'        then G.GroupName
    when 'Organization' then O.OrgName
    when 'Person'       then P.FirstName + ' ' + P.LastName
    when 'Resource'     then R.ResourceName
    when 'Batch'        then B.BatchComment
  end
from Entities E
left join AP_Applications A   on E.EntityID = A.EntityID
left join CN_Groups G         on E.EntityID = G.EntityID
left join CN_Organizations O  on E.EntityID = O.EntityID
left join CN_People P         on E.EntityID = P.EntityID
left join Resources R         on E.EntityID = R.EntityID
left join AR_PaymentBatches B on E.EntityID = B.EntityID
where E.EntityID = @EntityID

return @Name

END

The purpose of this function is to look up the name of an entity. Depending on the kind of entity we're talking about, we have to pull that name from a different table. This is a very common pattern in database normalization- a database equivalent of inheritance. All the common fields to all entities get stored in an Entities table, while specific classes of entity (like "Applications") get their own table which joins back to the Entities table.

On the surface, this code doesn't even really look like a WTF. By the book, this is really how you'd write this kind of function- if we were going by the book.

But the problem was that these tables were frequently very large, and even with indexes on the EntityID fields, it simply performed horribly. And since "showing the name of the thing you're looking at" was a common query, that performance hit added up.

The fix was easy- write out seven unique functions- one for each entity type- and then re-write this function to use an IF statement to decide which one to execute. The code was simpler to understand and read, and performed much faster.

In the end, perhaps not really a WTF, or perhaps the root WTF is some of the architectural decisions which allow this to exist (why a function for getting the name, and the name alone, which means we execute this query independently and not part of a more meaningful join?). But I think it's an interesting example of how "this is the right way to do it" can lead to some unusual outcomes.

[Advertisement] Utilize BuildMaster to release your software with confidence, at the pace your business demands. Download today!




codesod

CodeSOD: Trophy Bug Hunting

Quality control is an important business function for any company. When your company is shipping devices with safety concerns, it's even more important. In some industries, a quality control failure is bound to be national headlines.

When the quality control software tool stopped working, everyone panicked. At which point, GRH stepped in.

Now, we've discussed this software and GRH before, but as a quick recap, it was:

written by someone who is no longer employed with the company, as part of a project managed by someone who is no longer at the company, requested by an executive who is also no longer at the company. There are no documented requirements, very few tests, and a lot of "don't touch this, it works".

And this was a quality control tool. So we're already in bad shape. It also had been unmaintained for years- a few of the QC engineers had tried to take it over, but weren't programmers, and it had essentially languished.

Specifically, it was a quality control tool used to oversee the process by about 50 QC engineers. It automates a series of checks by wrapping around third party software tools, in a complex network of "this device gets tested by generating output in program A, feeding it to program B, then combining the streams and sending them to the device, but this device gets tested using programs D, E, and F."

The automated process using the tool has a shockingly low error rate. Without the tool, doing things manually, the error rate climbs to 1-2%. So unless everyone wanted to see terrifying headlines in the Boston Globe about their devices failing, GRH needed to fix the problem.

GRH was given the code, in this case a a zip file on a shared drive. It did not, at the start, even build. After fighting with the project configuration to resolve that, GRH was free to start digging in deeper.

Public Sub connect2PCdb()
        Dim cPath As String = Path.Combine(strConverterPath, "c.pfx")
        Dim strCN As String

        ' JES 12/6/2016: Modify the following line if MySQL server is changed to a different server.  A dump file will be needed to re-create teh database in the new server.
        strCN = "metadata=res://*/Model1.csdl|res://*/Model1.ssdl|res://*/Model1.msl;provider=MySql.Data.MySqlClient;provider connection string='server=REDACTED;user id=REDACTED;database=REDACTED;sslmode=Required;certificatepassword=REDACTED;certificatefile=REDACTEDc.pfx;password=REDACTED'"
        strCN = Regex.Replace(strCN, "certificatefile=.*?pfx", "certificatefile=" & cPath)
        pcContext = New Entities(strCN)
        strCN = "metadata=res://*/Model1.csdl|res://*/Model1.ssdl|res://*/Model1.msl;provider=MySql.Data.MySqlClient;provider connection string='server=REDACTED;user id=REDACTED;persistsecurityinfo=True;database=REDACTED;password=REDACTED'"
        strCN = Regex.Match(strCN, ".*'(.*)'").Groups(1).Value

        Try
            strCN = pcContext.Database.Connection.ConnectionString
            cnPC.ConnectionString = "server=REDACTED;user id=REDACTED;password=REDACTED;database=REDACTED;"
            cnPC.Open()
        Catch ex As Exception

        End Try
    End Sub

This is the code which connects to the backend database. The code is in the category of more of a trainwreck than a WTF. It's got a wonderful mix of nonsense in here, though- a hard-coded connection string which includes plaintext passwords, regex munging to modify the string, then hard-coding a string again, only to use regexes to extract a subset of the string. A subset we don't use.

And then, for a bonus, the whole thing has a misleading comment- "modify the following line" if we move to a different server? We have to modify several lines, because we keep copy/pasting the string around.

Oh, and of course, it uses the pattern of "open a database connection at application startup, and just hold that connection forever," which is a great way to strain your database as your userbase grows.

The good news about the hard-coded password is that it got GRH access to the database. With that, it was easy to see what the problem was: the database was full. The system was overly aggressive with logging, the logs went to database tables, the server was an antique with a rather small hard drive, and the database wasn't configured to even use all of that space anyway.

Cleaning up old logs got the engineers working again. GRH kept working on the code, though, cleaning it up and modernizing it. Updating to latest version of the .NET Core framework modified the data access to be far simpler, and got rid of the need for hard-coded connection strings. Still, GRH left the method looking like this:

    Public Sub connect2PCdb()
        'Dim cPath As String = Path.Combine(strConverterPath, "c.pfx")
        'Dim strCN As String

        ' JES 12/6/2016: Modify the following line if MySQL server is changed to a different server.  A dump file will be needed to re-create teh database in the new server.
        'strCN = "metadata=res://*/Model1.csdl|res://*/Model1.ssdl|res://*/Model1.msl;provider=MySql.Data.MySqlClient;provider connection string='server=REDACTED;user id=REDACTED;database=REDACTED;sslmode=Required;certificatepassword=REDACTED;certificatefile=REDACTEDc.pfx;password=REDACTED'"
        'strCN = Regex.Replace(strCN, "certificatefile=.*?pfx", "certificatefile=" & cPath)
        'pcContext = New Entities(strCN)
        'strCN = "metadata=res://*/Model1.csdl|res://*/Model1.ssdl|res://*/Model1.msl;provider=MySql.Data.MySqlClient;provider connection string='server=REDACTED;user id=REDACTED;persistsecurityinfo=True;database=REDACTED;password=REDACTED'"
        'strCN = Regex.Match(strCN, ".*'(.*)'").Groups(1).Value

        'GRH 2021-01-15.  Connection information moved to App.Config
        'GRH 2021-08-13.  EF Core no longer supports App.Config method
        pcContext = New PcEntities

        Try
            ' GRH 2021-08-21  This variable no longer exists in .NET 5
            'strCN = pcContext.Database.Connection.ConnectionString
            ' GRH 2021-08-20  Keeping the connection open causes EF Core to not work
            'cnPC.ConnectionString = "server=REDACTED;user id=REDACTED;password=REDACTED;database=REDACTED;SslMode=none"
            'cnPC.Open()
        Catch ex As Exception

        End Try
    End Sub

It's now a one-line method, with most of the code commented out, instead of removed. Why on Earth is the method left like that?

GRH explains:

Yes, I could delete the function as it is functionally dead, but I keep it for the same reasons that a hunter mounts a deer's head above her mantle.

[Advertisement] Plan Your .NET 9 Migration with Confidence
Your journey to .NET 9 is more than just one decision.Avoid migration migraines with the advice in this free guide. Download Free Guide Now!




codesod

CodeSOD: A Base Nature

Once again, we take a look at the traditional "if (boolean) return true; else return false;" pattern. But today's, from RJ, offers us a bonus twist.

public override bool IsValid
{
   get
   {
      if (!base.IsValid)
         return false;

      return true;
   }
}

As promised, this is a useless conditional. return base.IsValid would do the job just as well. Except, that's the twist, isn't it. base is our superclass. We're overriding a method on our superclass to… just do what the base method does.

This entire function could just be deleted. No one would notice. And yet, it hasn't been. Everyone agrees that it should be, yet it hasn't been. No one's doing it. It just sits there, like a pimple, begging to be popped.

[Advertisement] Keep the plebs out of prod. Restrict NuGet feed privileges with ProGet. Learn more.




codesod

CodeSOD: All the Rest Have 31

Horror movies, as of late, have gone to great lengths to solve the key obstacle to horror movies- cell phones. When we live in a world where help is a phone call away, it's hard to imagine the characters not doing that. So screenwriters put them in situations where this is impossible: in Midsommar they isolate them in rural Sweden, in Get Out calling the police is only going to put our protagonist in more danger. But what's possibly more common is making the film a period piece- like the X/Pearl/Maxxxine trilogy, Late Night with the Devil, or Netflix's continuing series of R.L. Stine adaptations.

I bring this up, because today's horror starts in 1993. A Norwegian software company launched its software product to mild acclaim. Like every company, it had its ups and downs, its successes and missteps. On the surface, it was a decent enough place to work.

Over the years, the company tried to stay up to date with technology. In 1993, the major languages one might use for launching a major software product, your options are largely C or Pascal. Languages like Python existed, but weren't widely used or even supported on most systems. But the company stayed in business and needed to update their technology as time passed, which meant the program gradually grew and migrated to new languages.

Which meant, by the time Niklas F joined the company, they were on C#. Even though they'd completely changed languages, the codebase still derived from the original C codebase. And that meant that the codebase had many secrets, dark corners, and places a developer should never look.

Like every good horror movie protagonist, Niklas heard the "don't go in there!" and immediately went in there. And lurking in those shadows was the thing every developer fears the most: homebrew date handling code.

/// <summary>
/// 
/// </summary>
/// <param name="dt"></param>
/// <returns></returns>
public static DateTime LastDayInMonth(DateTime dt)
{
	int day = 30;
	switch (dt.Month)
	{
		case 1:
			day = 31;
			break;
		case 2:
			if (IsLeapYear(dt))
				day = 29;
			else
				day = 28;
			break;
		case 3:
			day = 31;
			break;
		case 4:
			day = 30;
			break;
		case 5:
			day = 31;
			break;
		case 6:
			day = 30;
			break;
		case 7:
			day = 31;
			break;
		case 8:
			day = 31;
			break;
		case 9:
			day = 30;
			break;
		case 10:
			day = 31;
			break;
		case 11:
			day = 30;
			break;
		case 12:
			day = 31;
			break;
	}
	return new DateTime(dt.Year, dt.Month, day, 0, 0, 0);
}

/// <summary>
/// 
/// </summary>
/// <param name="dt"></param>
/// <returns></returns>
public static bool IsLeapYear(DateTime dt)
{
	bool ret = (((dt.Year % 4) == 0) && ((dt.Year % 100) != 0) || ((dt.Year % 400) == 0));
	return ret;
}

For a nice change of pace, this code isn't incorrect. Even the leap year calculation is actually correct (though my preference would be to just return the expression instead of using a local variable). But that's what makes this horror all the more insidious: there are built-in functions to handle all of this, but this code works and will likely continue to work, just sitting there, like a demon that we've made a pact with. And suddenly we realize this isn't Midsommar but Ari Aster's other hit film, Hereditary, and we're trapped being in a lineage of monsters, and can't escape our inheritance.

[Advertisement] Utilize BuildMaster to release your software with confidence, at the pace your business demands. Download today!




codesod

CodeSOD: A Matter of Understanding

For years, Victoria had a co-worker who "programmed by Google Search"; they didn't understand how anything worked, they simply plugged their problem into Google search and then copy/pasted and edited until they got code that worked. For this developer, I'm sure ChatGPT has been a godsend, but this code predates its wide use. It's pure "Googlesauce".

    StringBuffer stringBuffer = new StringBuffer();
    stringBuffer.append("SELECT * FROM TABLE1 WHERE COLUMN1 = 1 WITH UR");

    String sqlStr = stringBuffer.toString();
    ps = getConnection().prepareStatement(sqlStr);

    ps.setInt(1, code);

    rs = ps.executeQuery();

    while (rs.next())
    {
      count++;
    }

The core of this WTF isn't anything special- instead of running a SELECT COUNT they run a SELECT and then loop over the results to get the count. But it's all the little details in here which make it fun.

They start by using a StringBuffer to construct their query- not a horrible plan when the query is long, but this is just a single, simple, one-line query. The query contains a WITH clause, but it's in the wrong spot. Then they prepareStatement it, which does nothing, since this query doesn't contain any parameters (and also, isn't syntactically valid). Once it's prepared, they set the non-existent parameter 1 to a value- this operation will throw an exception because there are no parameters in the query.

Finally, they loop across the results to count.

The real WTF is that this code ended up in the code base, somehow. The developer said, "Yes, this seems good, I'll check in this non-functional blob that I definitely don't understand," and then there were no protections in place to keep that from happening. Now it falls to more competent developers, like Victoria, to clean up after this co-worker.

[Advertisement] Utilize BuildMaster to release your software with confidence, at the pace your business demands. Download today!




codesod

CodeSOD: Counting it All

Since it's election day in the US, many people are thinking about counting today. We frequently discuss counting here, and how to do it wrong, so let's look at some code from RK.

This code may not be counting votes, but whatever it's counting, we're not going to enjoy it:

case LogMode.Row_limit: // row limit excel = 65536 rows
    if (File.Exists(personalFolder + @"" + fileName + ".CSV"))
    {
        using (StreamReader reader = new StreamReader(personalFolder + @"" + fileName + ".CSV"))
        {
            countRows = reader.ReadToEnd().Split(new char[] { '
' }).Length;
        }
    }

Now, this code is from a rather old application, originally released in 2007. So the comment about Excel's row limit really puts us in a moment in time- Excel 2007 raised the row limit to 1,000,000 rows. But older versions of Excel did cap out at 65,536. And it wasn't the case that everyone just up and switched to Excel 2007 when it came out- transitioning to the new Office file formats was a conversion which took years.

But we're not even reading an Excel file, we're reading a CSV.

I enjoy that we construct the name twice, because that's useful. But the real magic of this one is how we count the rows. Because while Excel can handle 65,536 rows at this time, I don't think this program is going to do a great job of it- because we read the entire file into memory with ReadToEnd, then Split on newlines, then count the length that way.

As you can imagine, in practice, this performed terribly on large files, of which there were many.

Unfortunately for RK, there's one rule about old, legacy code: don't touch it. So despite fixing this being a rather easy task, nobody is working on fixing it, because nobody wants to be the one who touched it last. Instead, management is promising to launch a greenfield replacement project any day now…

[Advertisement] Keep all your packages and Docker containers in one place, scan for vulnerabilities, and control who can access different feeds. ProGet installs in minutes and has a powerful free version with a lot of great features that you can upgrade when ready.Learn more.




codesod

CodeSOD: Uniquely Validated

There's the potential for endless installments of "programmers not understanding how UUIDs work." Frankly, I think the fact that we represent them as human readable strings is part of the problem; sure, it's readable, but conceals the fact that it's just a large integer.

Which brings us to this snippet, from Capybara James.

    if (!StringUtils.hasLength(uuid) || uuid.length() != 36) {
        throw new RequestParameterNotFoundException(ErrorCodeCostants.UUID_MANDATORY_OR_FORMAT);
    }

StringUtils.hasLength comes from the Spring library, and it's a simple "is not null or empty" check. So- we're testing to see if a string is null or empty, or isn't exactly 36 characters long. That tells us the input is bad, so we throw a RequestParameterNotFoundException, along with an error code.

So, as already pointed out, a UUID is just a large integer that we render as a 36 character string, and there are better ways to validate a UUID. But this also will accept any 36 character string- as long as you've got 36 characters, we'll call it a UUID. "This is valid, really valid, dumbass" is now a valid UUID.

With that in mind, I also like the bonus of it not distinguishing between whether or not the input was missing or invalid, because that'll make it real easy for users to understand why their input is getting rejected.

[Advertisement] ProGet’s got you covered with security and access controls on your NuGet feeds. Learn more.




codesod

CodeSOD: Pay for this Later

Ross needed to write software to integrate with a credit card payment gateway. The one his company chose was relatively small, and only served a handful of countries- but it covered the markets they cared about and the transaction fees were cheap. They used XML for data interchange, and while they had no published schema document, they did have some handy-dandy sample code which let you parse their XML messages.

$response = curl_exec($ch);
$authecode = fetch_data($response, '<authCode>', '</authCode>');
$responsecode = fetch_data($response, '<responsecode>', '</responsecode>');
$retrunamount = fetch_data($response, '<returnamount>', '</returnamount>');
$trxnnumber = fetch_data($response, '<trxnnumber>', '</trxnnumber>');
$trxnstatus = fetch_data($response, '<trxnstatus>', '</trxnstatus>');
$trxnresponsemessage = fetch_data($response, '<trxnresponsemessage>', '</trxnresponsemessage>');

Well, this looks… worrying. At first glance, I wonder if we're going to have to kneel before Z̸̭͖͔͂̀ā̸̡͖͕͊l̴̜͕͋͌̕g̸͉̳͂͊ȯ̷͙͂̐. What exactly does fetch_data actually do?

function fetch_data($string, $start_tag, $end_tag)
{

  $position = stripos($string, $start_tag);
  $str = substr($string, $position);
  $str_second = substr($str, strlen($start_tag));
  $second_positon = stripos($str_second, $end_tag);
  $str_third = substr($str_second, 0, $second_positon);
  $fetch_data = trim($str_third);
  return $fetch_data;
}

Phew, no regular expressions, just… lots of substrings. This parses the XML document with no sense of the document's structure- it literally just searches for specific tags, grabs whatever is between them, and calls it done. Nested tags? Attributes? Self-closing tags? Forget about it. Since it doesn't enforce that your open and closing tags match, it also lets you grab arbitrary (and invalid) document fragments- fetch_data($response, "<fooTag>", "<barTag>"), for example.

And it's not like this needs to be implemented from scratch- PHP has built-in XML parsing classes. We could argue that by limiting ourselves to a subset of XML (which I can only hope this document does) and doing basic string parsing, we've built a much simpler approach, but I suspect that after doing a big pile of linear searches through the document, we're not really going to see any performance benefits from this version- and maintenance is going to be a nightmare, as it's so fragile and won't work for many very valid XML documents.

It's always amazing when TRWTF is neither PHP nor XML but… whatever this is.

[Advertisement] Plan Your .NET 9 Migration with Confidence
Your journey to .NET 9 is more than just one decision.Avoid migration migraines with the advice in this free guide. Download Free Guide Now!




codesod

CodeSOD: Bad Code and Taxes

Here in the US, “tax season” is extended into the summer. No one likes dealing with taxes, obviously, but we agree that the social benefits outweigh the costs. I can’t speak to how folks feel in...




codesod

CodeSOD: WTFYou, Pay Me

Julien’s employer has switched their payroll operations to a hosted solution. The hosted solution has some… interesting features. The fact that it has a “share” button, implying you can share your...




codesod

CodeSOD: A Tern Off

Jim J's co-worker showed him this little snippet in the codebase. foreach (ToolStripMenuItem item in documentMenuItem.DropDownItems) { item.Enabled = item.Enabled ? Document.Status ==...




codesod

CodeSOD: The Evil CMS

Content Management Systems always end up suffering, at least a little, from the Inner Platform Effect. There’s the additional problem that, unlike say, a big ol’ enterprise HR system or similar,...




codesod

CodeSOD: A Quick Escape

I am old. I’m so old that, when I entered the industry, we didn’t have specializations like “frontend” and “backend” developers. You just had developers, and everybody just sort muddled about. As web...




codesod

CodeSOD: The Sound of GOTO

Let's say you have an audio file, or at least, something you suspect is an audio file. You want to parse through the file, checking the headers and importing the data. If the file is invalid,...




codesod

CodeSOD: Reasonable Lint

While testing their application, Nicholas found some broken error messages. Specifically, they were the embarassing “printing out JavaScript values” types of errors, so obviously something was broken...




codesod

CodeSOD: Dating Automation

Good idea: having QA developers who can build tooling to automate tests. Testing is tedious, testing needs to be executed repeatedly, and we're not just talking simple unit tests, but in an ideal...