Dell Precision 7820 Xeon Workstation, dual Quadro RTX 4000 SLI performance stability woes using AutoCAD 2018 Navisworks Revizto PractiCAD

by Andy Flagg, Publication Date: Thursday, December 3, 2020
I spent a whole day trying to stabilize a Dell workstation so it could run 3 CAD modeling programs at the same time on two Dell-supplied RTX 4000 video cards in SLI mode. Two identical systems, specially built at about $8k each, had not worked properly for about 2 months since they were delivered. So here goes my Dec 3rd, wasted yet challenging. Mind you, the entire office and staff were happy to see I could confirm their headaches and cussing, and that I would not tolerate these kinds of issues for 2 hours, let alone 2 months.
So, with that said: after 7 hours straight, aside from a run to get some DP-to-HDMI video cables from a wholesaler down the street, and countless attempts to stabilize the system and prevent lockups, I may have found that the two RTX 4000s in SLI don't work very well. A single RTX 4000 gave acceptable performance and was stable with DEP enabled. One of the modeling programs did not like 8/16 hyper-threading (8 cores, 16 threads) and wanted 8/8, so it would force the BIOS configuration change and a reboot; I guess some programmers don't like look-ahead pipelines, etc. Well, on an $8k system you are throttled to 8/8 instead of 8/16, whatever. I can tell you, I would have a conversation with the programmers on that issue. More to come on that subject.
So here are the configuration basics:
* Dell Precision 7820 Xeon Extreme, 32GB RAM, 2 SSD drives (oddly, the original shipping configuration shows an M.2 2280?), Windows 10 64-bit, 2 NVIDIA Quadro RTX 4000 video cards with Dell part numbers (see OEM support link at the bottom)
* AutoCAD 2018 with PractiCAD, Navisworks, Revizto
* With two programs running, fine; with all three, slow and then it locks up. 3 monitors, LG 32" with SVGA/HDMI ports of some sort; nice. I checked the LG specs and they were good.
* The computers are two months old, with about 50 hours of on time. They lock up, and the user just has to go back to the old computer on Windows 7, where the three programs worked, slowly, but worked, and it was just low on disk space. The 3rd party support recommendation had been to upgrade the hardware, OS, and applications.
Dell support and 3rd party IT support just could not figure it out after 3 different reviews and 3 short-lived visits. What can we do, they said? Find a 3rd party to look at it.
This is where I stepped in, per a special request from a recommended 3rd party, and here we go: still a challenge after 1-2 hours, and still unacceptable behavior. There were 5 complex variables, and I had to isolate which one, or which pair, was the problem. The 1-2 hours, aka a 3 hour tour, became a 7 hour full pull.
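As a rough sketch of what "isolate which one or pair" means in terms of test count, the snippet below lists every single-variable and two-variable change for five variables. The five labels are my own reading of the steps in this post, not a list taken from the original notes, so treat them as placeholders.

```python
from itertools import combinations

# Hypothetical labels for the five variables; adjust to the real suspects.
VARIABLES = ["video cables", "drivers", "BIOS settings", "dual GPU (SLI)", "DEP"]


def isolation_plan(variables):
    """Return single-variable tests first, then every two-variable pair."""
    singles = [(v,) for v in variables]
    pairs = list(combinations(variables, 2))
    return singles + pairs


plan = isolation_plan(VARIABLES)
for number, changed in enumerate(plan, start=1):
    print(f"test {number:2d}: change {' + '.join(changed)}")
```

Five singles plus ten pairs is fifteen test runs, which goes some way to explaining how a 1-2 hour visit turns into a full day when each run means reloading three programs and huge model files.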
let's get started...
1. 21-point @GGPCTU #GGPCTU Tune-up: all the Windows updates and patches; reviewed, cleared, and monitored Event Viewer, Task Manager, Performance Monitor, and the logs. Nothing. Just AutoCAD not responding with 3 modeling programs all open with some pretty huge files: a 4GB network model file, a 186MB local AutoCAD file in a Dropbox folder, etc.
2. Drivers, program preference resets, program repairs, sfc scans. Nada.
3. Swapped the video cables from DP-to-SVGA to DP-to-HDMI.
Steps 1-3 showed no signs of helping, and mind you, that was after a few hours.
4. Reviewed the mobo BIOS: 5 things were not enabled that I normally enable for this type of workstation.
* The BIOS/mobo was an OEM custom design, one could tell. So I set the fans from auto to on at 25%, since the RTX 4000s were really hot and the chassis had no additional fan support for the dual RTX 4000 addition, and enabled hyper-threading, going from 8/8 (8 cores set to just 8 threads) to 8/16. Yet one of the modeling programs forced the CPU and mobo back to 8/8. Odd. Note: RAID mode was selected, not AHCI nor legacy/IDE mode.
5. Thought about all the thrashing and lockups: disk I/O, memory I/O, GPU I/O. None of that was evident for these lockups.
6. So I pulled a video card and went with one RTX 4000. Why not; maybe the modeling was struggling with SLI and vendor driver support across the three modeling programs? Well, that result was interesting: stable, and faster, and I don't know why, but stable. FYI, all the AutoCAD performance tricks for dual cards were enabled, and all the 3D settings and modeling support were enabled, so don't go back down that path of woe.
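Item 4 mentions that the two RTX 4000s were running really hot. One way to put numbers on that while the three programs are open is to poll `nvidia-smi`, which ships with the NVIDIA driver. This is a minimal sketch, not what I ran on the day; the 85 C limit is an arbitrary assumption for illustration, not an NVIDIA or Dell figure.

```python
import csv
import io
import subprocess

QUERY = "index,name,temperature.gpu,utilization.gpu"


def parse_gpu_rows(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(fields) != 4:
            continue  # blank or unexpected line
        try:
            rows.append({
                "index": int(fields[0]),
                "name": fields[1].strip(),
                "temp_c": int(fields[2]),
                "util_pct": int(fields[3]),
            })
        except ValueError:
            continue  # a field such as "[N/A]" that is not a number
    return rows


def read_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_rows(result.stdout)


def hot_gpus(rows, limit_c=85):
    """Return the GPUs at or above limit_c (assumed threshold)."""
    return [row for row in rows if row["temp_c"] >= limit_c]
```

Calling `hot_gpus(read_gpus())` every few seconds, once with both cards installed and once with a single card, would show whether the chassis cooling is part of the story or a separate problem.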
Finally, after almost 7 hours straight, it still made no sense. All apps were configured, and a matching system next to it showed almost the same behaviors, just not as bad, running AutoCAD 2021 instead of AutoCAD 2018. It still locked up: a full freeze for 15 minutes, then release and freeze again, several times, with slow performance. The 3D model source was on the network over gigabit, and yes, some of the files were cached in Dropbox.
I mean, Performance Monitor was moving along with nothing stuck; the only symptom was the AutoCAD app not responding, and nothing else. Just a full 100% lock: 3 apps, 2 fine, 3 bad, totally quiet, and a full lockup and freeze.
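Since the only visible symptom was an app showing "not responding", a small watcher can log when that happens instead of someone staring at the screen. The sketch below leans on the built-in Windows `tasklist` command with its `STATUS eq NOT RESPONDING` filter; the sample line in the usage note is made up for illustration.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Turn `tasklist /FO CSV /NH` output into (image name, PID) tuples."""
    hung = []
    for fields in csv.reader(io.StringIO(text)):
        # Real rows have the PID in the second column; the "INFO: No tasks..."
        # message that tasklist prints when nothing matches is skipped.
        if len(fields) >= 2 and fields[1].strip().isdigit():
            hung.append((fields[0], int(fields[1])))
    return hung


def not_responding():
    """Ask Windows which processes are currently flagged as not responding."""
    result = subprocess.run(
        ["tasklist", "/FI", "STATUS eq NOT RESPONDING", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(result.stdout)
```

For example, `parse_tasklist_csv('"acad.exe","4312","Console","1","2,104,332 K"')` gives `[("acad.exe", 4312)]`. Calling `not_responding()` on a timer and writing the result with a timestamp would give a record of each freeze and release to line up against Event Viewer.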
So are we talking memory collisions or a memory sharing violation, where Windows 10 won't fault and BSOD nor throw an exception, yet it should? What if we look at DEP and just enable that? I saw that earlier and thought, nah, why would that matter, this is not a server. But what if it was like DEP on servers, which just needs to be enabled? So I did, and voila.
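For anyone wanting to check where DEP stands before and after a change like this, Windows exposes the system-wide policy as `DataExecutionPrevention_SupportPolicy` on the `Win32_OperatingSystem` WMI class. The sketch below reads it through PowerShell and decodes the number. It only reports the policy; it does not change anything, and it is not a record of the exact setting I picked on this machine.

```python
import subprocess

# Documented values of Win32_OperatingSystem.DataExecutionPrevention_SupportPolicy
DEP_POLICIES = {
    0: "AlwaysOff: DEP is off for all processes",
    1: "AlwaysOn: DEP is on for all processes",
    2: "OptIn: DEP is on for essential Windows programs and services only",
    3: "OptOut: DEP is on for all processes except those explicitly excluded",
}


def describe_dep_policy(value):
    """Translate the numeric policy into a readable description."""
    return DEP_POLICIES.get(value, f"unknown policy value {value}")


def read_dep_policy():
    """Query the current system-wide DEP policy (Windows only)."""
    command = "(Get-CimInstance Win32_OperatingSystem).DataExecutionPrevention_SupportPolicy"
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", command],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())
```

On the workstation itself, `print(describe_dep_policy(read_dep_policy()))` shows the current mode, which is worth noting down before touching the setting so the change can be reversed.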
I still pondered, after 7 hours. DEP seemed like a logical step. Was there memory trashing and thrashing? The system was now stable. I mean, this is a workstation, a heavy duty one in fact, yet it was behaving badly, as if every app needed isolation, with no sharing or thrashing of resource heaps and no crossovers between unused shared resources. I mean, on most workstations DEP should be off or only partly on, and needing steps to isolate an app is rare, few and far between. What the heck. AutoCAD 2018 2D was fine; 3D was still a little jerky (the app was spread across two screens, 2D on the center screen and 3D on the right, with Navisworks and Revizto on the 3rd, left screen).
Today, we will see. We do have to get the user's MENULOAD settings reconfigured for AutoCAD PractiCAD, since it won't load right after resetting settings.
Note: after finishing for the day with this client, I looked at my history with the M2000 and M4000 versus the RTX 4000, and yes, the difference in cards is major: a 2015 card using Maxwell tech versus a 2018 card using Turing. I have built 20 systems for civil, structural, and survey engineering in the past few years using M2000, M4000, etc. cards, rock solid on i7-8700K and i9-9900K with premium SSD and DDR4, no problems. Whatever. The specs said the RTX 4000 was better than the M4000 and M6000, yet I wonder about the software programmers and their hardware optimizations between AutoCAD 2018 and 2021. I know the issue is real. The RTX 4000 had better performance on paper, yet the M4000 and M6000 performed better in some of the key areas of using AutoCAD. What if the M4000 worked better in AutoCAD 2018 and the RTX 4000 works better with AutoCAD 2021? There was a lot of discussion on this difference: the programmers for AutoCAD and modeling changed a lot, the hardware needed to change with it, and the two are not necessarily forward and backward compatible.
REF:  http://www.meta-lab.com/index.shtml?/english/pdsw.shtml
Note: none of the video drivers Dell lists for this workstation by service tag were in the OEM original build configuration. The card did have a Dell part number sticker on it, yet the 3 Dell drivers and the 3 versions direct from NVIDIA did not work. I had to let Windows figure it out: after 5-10 minutes of waiting for the HAL to work out what it had, and with the Internet update failing, I just had to let the INF files, I guess, get re-read.
REF: Service Tag # JT1CH63 @ https://www.dell.com/support/home/en-us/product-support/servicetag/0-YkhPRUdxUERDSktqVkNROVFoSHZPQT090/overview
My M4000 vs RTX 4000 performance review: just look at the price flip-flop from release to now, and the tech.
