From time to time I have to deal with my least favorite subjects like DB2. While DB2 has a functional SQL, it’s dated and has not really kept up with SQL language improvements over the years. But it is all that I have to flow data from the IBM AS400 to the outside world (** it could be worst… like dealing with old COBOL programs).
So lets talk about updating column values that involve more that one table. Yeah, joined tables that have to have one or more columns updated. Why would one need to do this? From time to time one needs to collect batches of rows from more than one table. And in the process of collecting batches one needs to update a “status” and/or date time stamp column to indicate the row was collected. It’s a common pattern in system integrations. When only one table is involved, it’s a simple matter. But when the data collected involves joined tables, then matters get complicated.
Getting right to it, here’s the general form:
Update <Table1> SET <StatusColumn> = 1 Where (Select DISTINCT <Table1>.ColumnA, <Table1>.ColumnB, <Table1>.ColumnC From <Table1> <Tabel2> Where <Table1>.ColumnA = <Table2>.ColumnA1 AND <Table1>.ColumnB = <Table2>.ColumnB1 AND <Table1>.ColumnC = <Table2>.ColumnC1 )
The two tables, in our example, have three columns we are Joining on… note this is an implicit join where the actual Join statement is not used.
This update statement is a normal way to handle updating a column with a join select in the where. But this goes off the rails when it comes to DB2. Meaning it does not work. So how does one fix this?
Update <Table1> SET <StatusColumn> = 1 Where (ColumnA, ColumnB, ColumnB) IN (Select DISTINCT <Table1>.ColumnA, <Table1>.ColumnB, <Table1>.ColumnC From <Table1> <Tabel2> Where <Table1>.ColumnA = <Table2>.ColumnA1 AND <Table1>.ColumnB = <Table2>.ColumnB1 AND <Table1>.ColumnC = <Table2>.ColumnC1 )
One adds to the Update Where clause a set of columns whose values are IN the Select of the joined tables. Like I said matters get a bit complicated for Joined tables.
So what happens in the first type of Update in DB2? It does execute, but it updates all rows in the table to the Set value. This happens because, why the Join Select does execute, there’s nothing in DB2 that allows it to be an implicit part of the Update’s Where clause. Yeah, got to love that! What is one really saying there?
The Where clause of the Update, in DB2, does not allow one to do just a Select… like other SQL implementations. It’s a hidden flaw, because while it looks to execute without an error, it simply does not do the intent. Therefore, one has to explicitly make sure that the Rows selected in the Where clause are part of the Update. Hence, the (columns) IN.
Yes, as I mentioned, DB2 SQL not amongst my favorite languages to deal with.