Articles for the Month of August 2014

Forcing a Database out of Single-User Mode Using Brute Force

This is what I found out lately: “Sometimes, brute force is the best way to get something done.

Let’s say you have a production server and you were in the middle of a detach process, when the detach failed after the database went into single-user mode. In my situation somehow I ended up with a parallel deadlock (where process “X” is blocked by “Y” which in turn is blocked by “X”).

So I tried to use ALTER DATABASE TestDB SET MULTI_USER to bring it back into multi user mode, but I was getting the following error:

Error 1205: Transaction (Process ID) was deadlocked on resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

So my first though was to set DEADLOCK_PRIORITY to HIGH and try it. The logic is that since deadlock priority can be set by the user I wanted to tell the SQL Server to allow my process to continue and drop the other process. So I tried:

USE master;
GO

SET DEADLOCK_PRIORITY HIGH;
GO

ALTER DATABASE TestDB SET MULTI_USER WITH NO_WAIT;

This failed too and I figured it made sense since you would always get a deadlock between the ALTER DATABASE session and the application’s session already exclusively using the database. So basically, the ALTER DATABASE session was chosen as deadlock victim since generally the transaction that requires the least amount of overhead to rollback is the transaction that is aborted!

So now What?!? That was when I found this thread online that talks about the brute force method when all your niceness is not getting you anywhere!

The idea is pretty simple, and not so much elegant; but hey, you can’t complain when it works! You are basically running the line below a 1000 times and you are hoping that during one of these tries SQL server realizes that you are not giving up and it might be cheaper on its resources to set your session as the exclusive session.

USE TestDB;
GO 1000

It took it about 300 tries before SQL Server gave up and gave me the exclusive session, from which I ran the ALTER DATABASE script to set the DB back to multi-user mode.

Deleted the SQL Server Resource from SQL Cluster? Recover it….

There is a chance that any of the resources in the SQL Cluster is removed from the cluster by mistake. This will cause the cluster to go into a Partial Online or Failure status. In order to fix the issue you need to recreate the resources in the cluster.

Following shows what to do when the SQL Server resource “SQL Server (SQLINSTANCE01)” and SQL Server Agent resource “SQL Server Agent (SQLINSTANCE01)” have been deleted from your SQL Cluster.

  1. First, using Domain Controller take a look at your OU to make sure that the cluster object belonging to your SQL Cluster is still there (ESPDBUAT01). A quick look shows that our cluster object (ESPDBUAT01) is still in AD, so no recovery is needed there.
    01-InitialDCCheck
  2. As you can see, your SQL Instance is down.
    02-SQLDown
  3. So you have to clean the role before you can anything, which means deleting the server associated with this role. So right click on Name: ESPDBUAT01 under Server Name and click Remove.
    03
  4. Click Yes.
    04
  5. Now you need to remove all disks from the cluster, so go to Storage, click on Disks, and highlight all the disks that are assigned to ESPDBUAT01 role, then click Remove from ESPDBUAT01.
    05-RemoveDisks
  6. Click Yes.
    06
    07-DiskAvailable
  7. Make sure there is no other resources assigned to this cluster.
    08-CleanNode
  8. Now you need to rename the old role to something else. Right click on the role’s name, and click Properties.
    09-RenameNode
  9. Change the name of the role (i.e. add -Old to the end of it), and click OK.
    10-RenameNode2
  10. Now if you check the Domain Controller you’ll see that the cluster object is disabled. Now you can begin recreating the entire thing from scratch.
    11-NodeDownInDC
  11. So now we are ready to add the deleted resource. In order to do so, right click on Roles in Failover Cluster Manager, and then select Configure Role…
    12-ConfigureROle
  12. Select the role Other Server, and then click Next.
    13-OtherServer
  13. In Client Access Point page, add the same name and IP Address that had used to setup the cluster object. Once done click Next.
    14-ClientAccessPoint
  14. Select all the disks belonging to this cluster, and then click Next.
    15-AddDisks
  15. In the Select Resource Types page, select the SQL Server and SQL Server Agent resource types, and then click Next.
    16-AddSQLResource
  16. Click Next.
    17-Confirm
  17. Click Finish, but we are not done yet!
    18-Finish
  18. As you can see in the following screenshot, all disks are added and are online. However, the server and the two SQL services are offline. They need to be configured their dependencies and properties before bringing them online.
    19-NodePartialUp
  19. Right click on New SQL Server and go to Properties.
    20-NewSQLServer
  20. Go to Dependencies tab and add the network name (ESPDBUAT01), the IP address, and all the Disks that our cluster uses.
    21-Dependencies
  21. Go to Properties tab and set VirtualServerName and InstanceName to their correct values.
    22-Properties
  22. Go to General tab and change the Name to SQL Server (SQLINSTANCE01), and then click OK.
    23-General
  23. Now we need to do a similar process for the SQL Server Agent. So right click on New SQL Server Agent, and go to Properties.
    24-NewSQLServerAgent
  24. Go to Properties tab and set VirtualServerName and InstanceName to their correct values. Note: The order of the rows in this tab (VirtualServername and InstanceName) might be different than the screenshot bellow. Make sure that you don’t type them in reverse. Also, any typo in this section causes the entire thing not to come online. So, double check your spelling!
    25-Properties
  25. Click on Dependencies tab and add SQL Server (SQLINSTANCE01).
    26-Dependecies
  26. Go to General tab and update the Name of the agent to SQL Server Agent (SQLINSTANCE01). Then click OK.
    27-General
  27. Now you are ready to start the role. Right click on the role (ESPDBUAT01), and click on Start Role. If you have typed everything correctly the role will start without any issues.
    28-StartRole
  28. Ensure that the two services are Online and you can connect to the SQL cluster in SSMS.
    29-RoleOnline
    30-SQLUp
  29. Check the Domain Controller and the cluster object ESPDBUAT01 is back online.
    31-DCUp
  30. Now you can clean up the Failover Cluster Manager. Right click on the old role ESPDBUAT01-Old and click Remove, and then Yes.
    32-RemoveOldRole
    33-Confirm

And you are done! 🙂

Shrinking Oracle TEMP tablespace

There might be times that you need to shrink the TEMP tablespace because you are running out of disk space. Now you can check and query to see what’s the name of the tablespace(s) and their names so you can shrink them. Run the following to see the list of all the temp tablespces:

SELECT tablespace_name, file_name, bytes
FROM dba_temp_files
WHERE tablespace_name like 'TEMP%';

And you’ll get something like this:

oracle_drop_temp_0

The first question you have to ask is why the temporary tablespace is so large. It may be due to a large query that you just ran with a sort that was a mistake or it may be due to some other exceptional circumstance. If that is the case then all you need to do to clean up is to shrink the temporary tablespace. So first try running the following line:

ALTER DATABASE TEMPFILE '/FULL_PATH_HERE/temp01.dbf' RESIZE 256M;

If it works then you are done, but chances are that you will get the following error:

ERROR at line 1:
ORA-03297: file contains used data beyond requested RESIZE value

This error may have occurred since Oracle will only shrink the file if the temporary tablespace is at the head of the file and if it is smaller than the size you specify. If the temporary tablespace you want to shrink is your default temporary tablespace, you will have to first create a new temporary tablespace, set it as the default temporary tablespace then drop your old default temporary tablespace and recreate it. Afterwords drop the second temporary table created. So let’s do this:

Create the new tablespace:

CREATE TEMPORARY TABLESPACE TEMP2
TEMPFILE '/FULL_PATH_HERE/temp02.dbs' SIZE 5M REUSE
AUTOEXTEND ON NEXT 1M MAXSIZE UNLIMITED
EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;

Tablespace created.

Set the new tablespace as the default so you can drop the original one:

ALTER DATABASE DEFAULT TEMPORARY TABLESPACE TEMP2;

Database altered.

Now, try dropping the first tablespace. Yes I said “Try” because it might hang and I’ll tell you why later:

DROP TABLESPACE TEMP INCLUDING CONTENTS AND DATAFILES;

If the above statement hangs for more than a couple of seconds, then there might be an OS process that has locked the TEMP tablespace and isn’t allowing Oracle to drop it. So, let’s find out if that’s true and kill the process if possible:

SELECT username ,session_num ,session_addr ,tablespace
FROM v$sort_usage
WHERE tablespace='TEMP'; --put the tablespace's name here

oracle_drop_temp_1

So now that we know there is process locking the tablespace, run the following line to generate the kill statement to kill the annoying session. You need to get the saddr from the last step:

SELECT 'ALTER SYSTEM KILL SESSION ''' || sid ||','|| serial# || ''';'
FROM v$session
WHERE saddr='C00000009E6F6118'; --This is the session_addr from last step

oracle_drop_temp_2

Copy the above line and run it and you should be able to ALTER SYSTEM which kills the process, now we can continue dropping the tablespace. Try again:

DROP TABLESPACE TEMP INCLUDING CONTENTS AND DATAFILES;

Tablespace dropped.

Now, you can recreate the original tablespace:

CREATE TEMPORARY TABLESPACE TEMP
TEMPFILE '/FULL_PATH_HERE/temp01.dbs' SIZE 5M REUSE
AUTOEXTEND ON NEXT 1M MAXSIZE UNLIMITED
EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;

Mark it as default, and then drop the second tablespace:

ALTER DATABASE DEFAULT TEMPORARY TABLESPACE TEMP;

Database altered.

DROP TABLESPACE TEMP2 INCLUDING CONTENTS AND DATAFILES;

Tablespace dropped.

It should have worked this time… Now you have recreated your TEMP tablespace and saved some diskspace.