CariElf

Memory fragments

Our top priority since 1.1 came out has been resolving the memory and performance issues that people have been having. BoundsChecker didn't come up with any significant memory leaks, so that left us with checking the change logs to see what we might have done to cause the issues. We were obviously doing something wacky, because people with 4 GB page files were still getting out-of-virtual-memory errors.

One of the changes we made was to the way that the shipcfg files were parsed. The shipcfg files are really just ini files, which I think Joe picked for the shipcfg format because ini files are supposed to be fast for reading and writing. They have two main disadvantages. The first is that you can't have an ini file greater than 64 KB on Windows ME and Windows 98; anything past the 64 KB mark won't be read. The second is that you have to know the size of the longest section name and the longest key name, or else know all the section and key names before you go to read in the data from the file. If you don't know the section and key names, you have to call APIs to get all the section and key names that are in the file, and you need a buffer to store them.
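
To make the format concrete, here is a minimal sketch of what parsing an ini-style shipcfg might look like. The section and key names are purely illustrative; the journal doesn't show the real shipcfg layout, and the real code goes through the Windows profile APIs rather than a hand-rolled parser like this.

```cpp
#include <map>
#include <sstream>
#include <string>

// Minimal ini-style parser sketch: section names in [brackets],
// key=value pairs below them. Section/key names here are invented
// for illustration -- the actual shipcfg schema isn't shown above.
std::map<std::string, std::map<std::string, std::string>>
parseIni(const std::string& text) {
    std::map<std::string, std::map<std::string, std::string>> data;
    std::istringstream in(text);
    std::string line, section;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        if (line.front() == '[' && line.back() == ']') {
            section = line.substr(1, line.size() - 2);  // start a new section
        } else if (auto eq = line.find('='); eq != std::string::npos) {
            data[section][line.substr(0, eq)] = line.substr(eq + 1);
        }
    }
    return data;
}
```

Notice that even this toy version needs somewhere to put every name it discovers — which is exactly the buffer-sizing problem described above when you don't know the names in advance.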

There are two ways to allocate memory: static and dynamic. Static means that you always allocate the same amount of memory, no matter how much of it you actually need to use. If you allocate too much, you'll be taking up memory you aren't using, but if you don't allocate enough, you'll run into problems because you don't have enough to use. Dynamic memory is allocated as you need it, when you need it. Since you're in charge of dynamic memory, you have to remember to deallocate (release) it. When you forget to deallocate it, it becomes a memory leak. As far as your program and OS are concerned, that memory is still being used and is unavailable to be allocated to something else. Another bad thing about dynamic allocation is that you can fragment memory.

I'm assuming that most of you have seen Windows' disk defrag program. If you haven't, you might want to go to Start->All Programs->Accessories->System Tools and run it, because your hard drive probably needs it. :) It will also give you a visual indication of what I'm talking about in this paragraph. When you create or copy files to your hard drive, the files are copied into consecutive blocks on the disk. If you delete one or more of those files, that leaves a hole. Depending on how big the hole is, it might get filled with other files. But if it's too small, it's just wasted. Disk defrag goes through your hard drive and tries to move stuff around so that it is stored more efficiently, with bigger blocks of available space. Fragmentation can also happen in system memory, aka RAM. So even if you deallocate dynamic memory, there's a chance that the memory you released won't be used again by your program, so it keeps using more memory. When the program runs out of available RAM, it will start using virtual memory. Virtual memory is really just a section of the hard drive that is set aside for the operating system's use when it runs out of RAM.

So how does all of this relate to GC2's issues? The shipcfg files originally used static memory allocation for the buffer that was used to get the section and keynames, but if you added enough jewelry and components to the ship, the buffer wasn't big enough. We needed a quick way to fix this that wouldn't involve re-engineering the shipcfg code and that wouldn't make reading in the shipcfg files take longer. The quickest change to implement was to switch from using static allocation to using dynamic allocation. In order to make sure that the buffer was big enough, I suggested that we dynamically allocate a buffer that was the same size as the file. The bad thing about this solution was that it didn't take into account that we weren't saving the parsed data values in memory after reading in the file for the first time. Every time you built a colony ship, the colony ship cfg file was read in and a buffer was allocated and deallocated. So I'm thinking that we were probably fragmenting memory.
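
A sketch of that stopgap, with hypothetical names: every time a ship is built, a buffer the size of the whole file is allocated, filled, parsed, and thrown away. It's correct, but it exercises the allocator on every single build, which is the allocate/free churn suspected of fragmenting memory.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Stopgap pattern described above (names hypothetical): size the
// buffer to the whole file so it can never be too small, then free
// it as soon as the parse is done -- once per ship built.
std::vector<char> readFileSizedBuffer(const std::string& fileContents) {
    std::vector<char> buffer(fileContents.size());  // one allocation per read
    std::copy(fileContents.begin(), fileContents.end(), buffer.begin());
    return buffer;  // released when the caller's copy goes out of scope
}
```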

Last week, I wrote code to make GC2 save the parsed values from the shipcfg files in memory, so that they only have to be read in once. It means that we're hanging on to a little more memory than we were before, but it should cut down on the fragmentation. It definitely cuts down on the amount of time spent creating ship graphics, which you will notice when loading a save game. If it's not enough, we may still have to change the shipcfg file format, but keeping the parsed results in memory will help keep load times down. I wrote the new shipcfg code in such a way that I should only have to replace one function if we need to switch the file format, the one that actually reads in the data. The code that uses the stored data to put together a ship with all its components will not need to be changed.
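
The parse-once idea is essentially a cache keyed by file name. A rough sketch under assumed names (the `ParsedShipCfg` fields and functions below are invented, not GC2's actual types):

```cpp
#include <map>
#include <string>

// Hypothetical cache sketch: parse each shipcfg once, then hand out
// the stored result on later requests instead of re-reading the file.
struct ParsedShipCfg { std::string hull; int componentCount = 0; };

std::map<std::string, ParsedShipCfg> g_cfgCache;
int g_parseCalls = 0;  // instrumentation for this sketch only

ParsedShipCfg parseFromDisk(const std::string& name) {
    ++g_parseCalls;  // the expensive path: file I/O plus parsing
    return ParsedShipCfg{name + "_hull", 1};  // stand-in for real parsing
}

const ParsedShipCfg& getShipCfg(const std::string& name) {
    auto it = g_cfgCache.find(name);
    if (it == g_cfgCache.end())
        it = g_cfgCache.emplace(name, parseFromDisk(name)).first;
    return it->second;  // cached copy; no re-parse on later calls
}
```

Building a hundred colony ships now costs one parse instead of a hundred, and one long-lived allocation instead of a hundred transient ones.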

Another problem area is the save game code. Writing to the hard disk is slow, so it's quicker to create the file in memory and then write it out to the file in one fell swoop rather than writing out each datum as you go. Since the exact file size of a given save game is unknown, the save game code uses dynamic memory allocation. Each object in the game (i.e. ships, planets, civs, etc.) has its own function to create a block of memory containing all the data it needs to store, which it then passes to the main save game function, and is added to the main block. This is using the same code as in GC1, but we had less dynamic data in GC1. Originally, all of the buffers started out as 1 KB, and whenever the buffers needed to increase in size, they would allocate their current size + 1 KB and copy the data from the old buffer to the new buffer, then deallocate the old buffer. The process of growing and copying the buffers was taking up more time than the actual saving of data, and I needed a way to improve performance without doing major surgery to the save game code before version 1.0 came out. So I did some profiling on how big the buffers were for a gigantic galaxy in the first few turns of the game and how much they grew, and used those numbers to change the initial buffer sizes and how much they grew by for each data object. This was, admittedly, more of a band-aid than an actual fix.
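
The cost of the two growth policies is easy to quantify. Growing by a fixed 1 KB step means the number of reallocate-and-copy rounds scales linearly with the final size (and total bytes copied scale quadratically), while doubling the capacity each time needs only a logarithmic number of rounds. A small sketch that just counts the reallocations:

```cpp
#include <cstddef>

// Counts how many reallocate-and-copy rounds an append-only buffer
// needs to reach `total` bytes under each growth policy.

// Fixed-step growth (+1 KB each time): O(n) reallocations,
// O(n^2) bytes copied overall.
std::size_t reallocCountLinear(std::size_t total, std::size_t step = 1024) {
    std::size_t cap = step, count = 0;
    while (cap < total) { cap += step; ++count; }
    return count;
}

// Geometric growth (double each time): O(log n) reallocations,
// O(n) bytes copied overall -- the usual std::vector-style policy.
std::size_t reallocCountDoubling(std::size_t total) {
    std::size_t cap = 1024, count = 0;
    while (cap < total) { cap *= 2; ++count; }
    return count;
}
```

For a 1 MB save block, the fixed-step policy does 1023 grow-and-copy rounds versus 10 for doubling — which is why the later reply calls the original scheme O(n^2).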

Apart from adding some new things that needed to be saved, I don't think that we've really touched the save game code much. However, it is still fairly inefficient because of all the buffer growing and copying. So the next change I have started to make is to make all the data objects use one buffer. Once that is done, I can make further optimizations to the code like initializing the buffer size based on the galaxy size, and see if it does more good than harm to keep the buffer in memory so that it doesn't have to allocate and deallocate 2-13 MB (or more) every time the game saves. At the very least, making everything use one buffer should cut down on fragmentation and make the saving go quicker. I will also be reviewing the code to make sure that only necessary data is being saved rather than being recalculated, in an effort to cut down loading time.
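
The single-buffer plan might look something like this sketch (the class and its interface are hypothetical, not the actual save game code): every object appends into one shared buffer that is pre-sized up front, and the whole thing is written to disk in one pass.

```cpp
#include <cstddef>
#include <vector>

// Single-buffer sketch (names hypothetical): instead of each game
// object growing and copying its own block, every object appends its
// data to one shared buffer, written to disk in one fell swoop.
class SaveBuffer {
public:
    // Reserve up front, e.g. based on galaxy size, so appends rarely
    // trigger a reallocation at all.
    explicit SaveBuffer(std::size_t reserveBytes) { bytes_.reserve(reserveBytes); }

    void append(const void* data, std::size_t len) {
        const char* p = static_cast<const char*>(data);
        bytes_.insert(bytes_.end(), p, p + len);  // one buffer, no per-object blocks
    }

    std::size_t size() const { return bytes_.size(); }

private:
    std::vector<char> bytes_;
};
```

Keeping one buffer alive (or at least pre-reserving it) also sidesteps most of the allocate/deallocate traffic that was suspected of fragmenting the heap.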

Once I've finished making our memory usage more efficient, I'll start working on the modding stuff again.



Edit: Ok, since I'm getting e-mails and comments about this, I would like to clarify that I am not blaming your hard drives for causing the crashes. The point of this article is that I am working on resolving the memory issues. The comment about running disk defrag was meant as a general statement that you should regularly defrag your hard drives, and to provide a visual representation of what is happening when memory is fragmented.

Update: In my sticky thread here Link I put instructions and a link for an unofficial test exe.
--Cari
Reply #26

Also, anything wrong with static instead of dynamic allocation?

Well, if we use static allocation for the buffer that reads in the shipcfg files, you would be limited in how much jewelry you could put on ships, because if the section names and key names got longer than the buffer size, stuff would get cut off.

Does it work the same way for the AI ships?

Well, the AI reads in their basic hull and configuration from a shipcfg file, but then they add their weapons, etc. So yes, it will also affect the AI ships.

If I understand correctly, does this mean that you will keep in memory the contents of files corresponding to data that won't change? Unless I am mistaken, once a ship has been designed, the associated shipcfg data won't change until the next use of the ship designer on that design. Maybe a dictionary-like structure would be useful for keeping track of buffers that you decide not to regenerate each time you are doing a save. Sometimes it is more efficient to calculate some data once and for all and memorize it instead of calculating it each time, especially if it isn't subject to lots of modification.

The shipcfgs will now stay in memory, parsed, until a new game is created or a save game loaded. Then they will be cleared so that there aren't a ton of shipcfgs in memory that aren't being used. 

I used to save all the definitions like techs, planet improvements, etc, in a file in the temp folder, and then just copy the data if it hadn't changed, but every time you design a new ship, it changes what needs to be in that block, so it was almost always re-saving that block anyway.  I could probably do it for everything except the ship designs and then just save the ship designs every time. 

Reply #27
Originally, all of the buffers started out as 1 KB and whenever the buffers needed to increase in size, they would allocate their current size + 1 KB and copy the data from the old buffer to the new buffer, then deallocate the old buffer.


Ouch, O(n^2). I hope you guys are now growing buffers by a factor rather than just incrementing them. Log for the win.

I'm surprised though that heap fragmentation is actually an issue... would it not be required that the allocations be both asymptotically increasing in size, and their deallocations be sporadic? Maybe do a profile to see if things are hanging around too long and try and deallocate them earlier if possible?
Reply #28
Also, anything wrong with static instead of dynamic allocation? If a person is not technically inclined, they should have Windows managing their pagefile, and only modders would really be in need of the multi-tasking benefits of dynamic mem use. And if a person is a modder, it seems they'd be smart enough to adjust page file usage or probably already have adequate RAM. Just curious, because it sounds like static assignment would be a lot faster/easier to solve the issue listed above than what you're looking at now?


It looks like you don't understand what static and dynamic memory refers to. In a program, you have the ability to create variables which take up memory. Without getting into too much detail, you can create a variable two ways, statically and dynamically. The static way means that memory is allocated to the program and stays allocated until the program ends. Using this allows compilers to efficiently utilize the space used, but also means that that space is permanently used for the life of the program. Dynamically allocating memory is usually used when something is short term and can be destroyed and the memory given back to the system (this is one way for leaks to happen if you forget to deallocate). Let's say you have 10 variables that, to keep it simple, use 1 byte each and are never needed at the same time. Statically, they will use 10 bytes for the entire time the program runs. Dynamically, they will use 1 byte only while one is being used, since there is only 1 of them in existence at any time. There are also issues about where static memory is stored versus dynamic, but heaps etc. are beyond this topic. In the end, they have to do with how memory is used and how much memory is used by a program, and don't really have to do with virtual memory and how it is set up by users.


I didn't write what I was thinking too well. I was referring to the fact people are reporting page file usage maxing out at 4+ Gig and Cari mentioning it was because dynamic memory allocation is freeing up mem in RAM, but then it's being written to the pagefile as it fragments. Since I am not a programmer - (I'm a hardware guy) it was my poor attempt to make sense of what's happening and to help.
Reply #29
So the bottom line is that quick solutions are very often NOT the BEST solutions. So avoid quick solutions which cause your customers unnecessary grief. Secondly, testing with no less than 2 GB physical memory was a very dumb move. Inexcusable. I have 1 GB memory and most people have 500 MB. Since I saw posts on how GC II 1.0 was optimized for lesser memory configurations, the same should have applied to your testing process for 1.1.
Assumption is the mother of all f-ups, as the vulgar expression goes, and Stardock blew it royally here. Hopefully this will be an OBJECT lesson that Windows-blinded developers will learn for a long time to come.
Reply #30
Actually I have always been scared of using defrag. On a large drive it is an overnight job, and if the power goes out your system is pretty much screwed ... no? Please correct me if I am wrong here ...


Dano
Reply #31
I'm surprised though that heap fragmentation is actually an issue... would it not be required that the allocations be both asymptotically increasing in size, and their deallocations be sporadic? Maybe do a profile to see if things are hanging around too long and try and deallocate them earlier if possible?


I'm not sure about the requirements for memory fragmentation. The symptoms seemed to match the problems we're seeing, though, and the changes to how the shipcfg files were read in seemed like a likely candidate. I think that we will need to do profiles even with these changes to see what is using all the memory, because GalCiv2 does seem to be using a lot.

So the bottom line is that quick solutions are very often NOT the BEST solutions. So avoid quick solutions which cause your customers unnecessary grief. Secondly, testing with no less than 2 GB physical memory was a very dumb move. Inexcusable. I have 1 GB memory and most people have 500 MB. Since I saw posts on how GC II 1.0 was optimized for lesser memory configurations, the same should have applied to your testing process for 1.1. Assumption is the mother of all f-ups, as the vulgar expression goes, and Stardock blew it royally here. Hopefully this will be an OBJECT lesson that Windows-blinded developers will learn for a long time to come.


os2wiz, would you like some help down from your soapbox? It's this kind of response that makes me not want to write dev journals, because there's always someone like you with a holier-than-thou attitude to make condescending remarks. I don't mind criticism when it's not handed down in such an insulting manner. If you can't be civil, keep your snotty remarks to yourself.

That being said, yes, we should have tested it on our lower end test machines, which we will definitely be doing before we release another update.
Reply #32
Actually I have always been scared of using defrag. On a large drive it is an overnight job, and if the power goes out your system is pretty much screwed ... no? Please correct me if I am wrong here ...


Hmm, I don't know. I've never had that happen. I may have to go out and buy an automatic power backup now. My new computer has 250 GB hard drives.

Reply #33
So the bottom line is that quick solutions are very often NOT the BEST solutions. So avoid quick solutions which cause your customers unnecessary grief. Secondly, testing with no less than 2 GB physical memory was a very dumb move. Inexcusable. I have 1 GB memory and most people have 500 MB. Since I saw posts on how GC II 1.0 was optimized for lesser memory configurations, the same should have applied to your testing process for 1.1.
Assumption is the mother of all f-ups, as the vulgar expression goes, and Stardock blew it royally here. Hopefully this will be an OBJECT lesson that Windows-blinded developers will learn for a long time to come.


That seems rather uncalled for. The statement was that the developers' and artists' computers were state of the art, not the testers'. They also beta tested on HUNDREDS of computers. That is what a PUBLIC beta is for. The reason this slipped through is not due to not testing on lower end systems (this occurs on high end systems as well), but due to the fact that none of the beta testers wrote in about it.

Quick solutions may not be the best, but I'd rather have a quick solution for the short term while they work on a better one than not play the game for three months. Stardock is doing 10 times the support most companies do, and you deserve to be flamed for treating them with such disrespect after that.
Reply #34
Hmm, I don't know. I've never had that happen. I may have to go out and buy an automatic power backup now. My new computer has 250 GB hard drives.

I, unfortunately, had the power go out while defragging. Kills the computer, pretty much. Now I'm a firm believer in large backup power supplies. (Losing 4 term papers sucks.)

Reply #35
You are right about that. The real danger is brown outs though, as they can destroy parts not just corrupt files. I have a UPS system with AVR that will run my whole office (5+ 19" CRT monitors) for more than an hour should I need it. Well worth the $250 when you consider that one of those monitors alone is worth more than that.

I would suggest a device from APC's RS series. Excellent devices with plenty of AVR capabilities and extra power for those extra fans and hard drives.
Reply #36
why don't you save the ship configs as binaries instead of parsing ascii files?
Reply #37
Strange that, as I have had a power cut in the past whilst defragging the HD. Never had a problem with the PC since. Just fired it up in safe mode and redid the defrag again.

BTW Cari, is there an ETA for this fix? A lot of my members and others here have now 'hung up' the game until the mem leak is fixed, as they say it is not now worth playing. Some even had it happen on a small map too. Personally I have yet to see this memory error, although I do have 1024 MB of RAM and a 3 GB virtual memory allocation on a separate drive.

Thanks

DG
Reply #38
why don't you save the ship configs as binaries instead of parsing ascii files?


Actually, I guess that would be a better solution in terms of fast I/O than changing to XML, if the changes I already made aren't enough. We usually don't use a binary format for data files, though, so that people have an easier time modding them.
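
For contrast, here is what the binary approach being discussed amounts to, with a hypothetical record layout (the real shipcfg fields are not shown anywhere in this thread): fixed-size fields are copied byte-for-byte, so loading is a memcpy instead of a text parse — fast, but opaque to modders without a dedicated editor tool.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Binary-format sketch; the record layout is invented for illustration.
struct ShipRecord {
    std::uint32_t hullId;
    std::uint32_t componentCount;
};

// Serialize: copy the record's bytes straight into a buffer.
std::vector<char> writeRecord(const ShipRecord& r) {
    std::vector<char> out(sizeof(ShipRecord));
    std::memcpy(out.data(), &r, sizeof(ShipRecord));
    return out;
}

// Deserialize: copy the bytes straight back -- no parsing, no name
// buffers, no 64 KB ini limit.
ShipRecord readRecord(const std::vector<char>& bytes) {
    ShipRecord r{};
    std::memcpy(&r, bytes.data(), sizeof(ShipRecord));
    return r;
}
```

The speed comes from skipping text parsing entirely; the moddability cost is that users can no longer open the file in Notepad, which is the trade-off the reply above is weighing.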

BTW Cari is there an ETA for this fix


I may be able to put up an unofficial test build with my changes later today as a link.
Reply #39
I may be able to put up an unofficial test build with my changes later today as a link.

That would be great!
Reply #40
I may be able to put up an unofficial test build with my changes later today as a link.


Looking forward to it as are most of my members

Reply #41
I have had very long save times since before the 1.1 beta. It never crashed, though, until recently. By the way, it's not too difficult to pull out 1 GB of memory from the developer's computer. It takes about 3 minutes to open the case, pop out the one or two memory modules involved, and close the case again. I think my remarks were not only on the money, but also humorously sarcastic. You failed to note the capitalized OBJECT lesson and the "Windows-blinded" developers. I also would enjoy pointing out that OS/2 uses memory FAR more efficiently than Windows, that the same code written to the OS/2 platform would not have required the same physical memory, and that its swap file works far better than the so-called virtual memory of Windows. My remarks were actually a triple entendre: a swipe at Stardock on the specific problem, a swipe using the name of Stardock products, and a swipe at the Windows platform as still quite inept in certain APIs and efficiency of resource utilization.
Reply #42
Cari, does the test build rectify the issue with the saved game files not loading? Or is that issue something else completely?

Thought I would ask so you hopefully don't get a hundred people asking the same thing
Reply #43
I'm not sure about the requirements for memory fragmentation. The symptoms seemed to match the problems we're seeing, though, and the changes to how the shipcfg files were read in seemed like a likely candidate. I think that we will need to do profiles even with these changes to see what is using all the memory, because GalCiv2 does seem to be using a lot.


Neither am I, but I think sporadic deallocation would be the key to any heap free-space fragmentation. But looking back at your journal you say you deallocate the buffer after parsing, so I'd say something else must be at work. Unless the allocator isn't consolidating the free space blocks? I can see why they might do that for the sake of speed, but I would imagine that would make for a really poor allocator. And even if they did, as long as a ship parse completed once subsequent parses (of the same file) should be able to re-use the same chunk of memory, so they shouldn't contribute to any fragmentation.

I can see why you think of heap fragmentation when the virtual mem is being eaten up, as you'll see that when there's fragmentation, but unless the allocator is brain-dead and not consolidating free blocks there must be a large amount of memory that is staying allocated to cause any free-space fragments. And if that's the case then the real problem is that memory usage is continually increasing... any fragmentation would just cause the allocations to fail sooner. i.e., eliminating them would only delay the allocation failures, not solve them.

Heh, I guess this qualifies as armchair debugging eh? Please keep up with the journals, I for one enjoy reading them.
Reply #44
Before I read the entire list of comments I wanted to add this.

I have a new dual-core Pentium D 820, 2.8 GHz x2, with 2 GB RAM.

I noticed that the load and save times dramatically increased when I used the Terran Fleet Pack from the download library and I had a lot of transports and constructors with that extra jewelry running around in a large or better universe in the late game.

1.10 seems to have made this better, but I definitely concur that the extra ship components were the severe memory problem. It was very obvious in playing a game with and without those very detailed ship designs. Without them the game flew on my machine; with them it started to crawl during the save games, and the load times took minutes instead of seconds.
Reply #45
I really do not see the uproar over my comments as being warranted. They were stated forcefully, but not insultingly. Carielf has a very thin skin, indeed, if my caustic comments, which were entirely accurate, get under her skin. There is NO reason for Stardock not to have a handful of computers used for testing other than the state-of-the-art developers' machines. I have participated in betas before for other software firms, and most of the better ones do have a few variable-configuration machines for internal testing, not a complete reliance on the public testers. It is always better public relations to catch the mistake in house before the release of a finished product. I will not apologize for my comments, which were accurate, insightful, and wryly humorous. I am not in the habit of kissing butt to have my views respected. They stand on their own merit. If you can't take the heat, get out of the kitchen.
I do not wish an antagonistic relationship, but I am direct in what I say and I am not going to change that for anyone. Unless of course you have a team of assassins ready to be dispatched to my home. Then slap me silly please and accept my humble apologies. Kill the messenger, kill the truth.
Reply #46
I definitely concur that the extra ship components were the severe memory problem. It was very obvious in playing a game with and without those very detailed ship designs. Without them the game flew on my machine; with them it started to crawl during the save games, and the load times took minutes instead of seconds.


With this I concur. As an avid ship designer for the game, I agree that too much jewelry on a ship increases the load times for the ship. With the latest batch of ships I created (the USM pack #1), the ships appear in the game, but there is a delay in showing them in the ship builder window and in the update screen. This is especially obvious in the constructor that has an additional 30 components on the exterior of the ship, as well as mounts for those and the two ships on top. In total there must be close to 100 jewelry parts on that ship alone.

That said, I am still getting a huge delay in loading the game even with the ships removed. I was able to reduce this somewhat by setting the mods check box to 'no' so the game is no longer checking the mods directory as well.

As an avid designer, as I have mentioned before, I would like to propose a 'cap' on the amount of jewelry you can use on any one ship. This needs to be no higher than 50 pieces for any one ship, as this is sufficient to make a good ship. More than this causes most people's games to slow to a crawl when you fleet the ships or have more than a dozen or so on the screen at once in normal view.

My $0.02

DG

Reply #47
I will not apologize for my comments, which were accurate, insightful, and wryly humorous. I am not in the habit of kissing butt to have my views respected. They stand on their own merit.

I don't agree with your first statement, but I do with the others. Ah, the irony.
Reply #48
Carielf has a very thin skin, indeed, if my caustic comments, which were entirely accurate, get under her skin.

Maybe you are mistaking the fact that you have a thick skull and she does not? Anyways, anyone who still clings to a dead and buried OS like os/2 really has no point in arguing business matters. And yes, I have used it before; at my place of business we had to use it for several small tasks on some 386 machines up until about 5 years ago. These days the only logical choice is between Windows and *nix-based OSes. An OS with no vendor support and almost no user base has no place anywhere except with some die-hard who is still clinging to antiquated software in defiance of a world that has moved on to better solutions. In any case, that does not sound like someone who should be giving advice to a person in charge of developing a hugely successful software title.
Reply #49
Your argument is specious. I never said I primarily or exclusively use os/2, which, under its current flavor, is called eComStation v1.2. I primarily use Windows XP. I rarely use os/2. That does not alter the fact that its memory management is far superior to anything that has ever come out of Redmond, Washington. And that matter has little relevance to my remarks on the flawed testing conducted for GalCiv II v1.1. I was not, nor am I now, attempting to mock or belittle any individual at Stardock. I think it is one of the better software companies and has very good personnel, including Carielf. That does not mean they didn't blunder somewhat in this project. I respect their intelligence and ability, but nevertheless they did blunder. Hopefully some introspection regarding this will help improve their future development. That is how all of us, as individuals and as vibrant enterprises, improve: by learning from our errors.
After all, I would never have achieved my exultant status in life had I not learned from my errors as a mortal. That is how I have achieved immortality. That is what my victories in GalCiv have said, in any case, at the end of the games.
Please do not take this seriously. I am actually NOT Beyond Immortality as the game suggests.
Food for thought.
Reply #50
This thread is here for the memory issues at hand, not your short-sighted, egotistical promotion of your own ideas. Please shut up or focus your attention on the polite and hopefully helpful conversation that existed here before your presence. I'm sure everyone will appreciate it.