acknex debugging questions

Posted By: Kartoffel

acknex debugging questions - 02/15/18 11:57

We're currently experiencing some crashes and are having difficulties finding out what causes these (I have a feeling they're memory related).

Are there any ways to catch errors and execute user defined code when they occur (before the program exits)? We'd like to dump some vars to narrow down the problem(s).

Also, are there any pdb debugging info files for the engine? That would probably make stack traces more helpful.

regards, Kartoffel
Posted By: jcl

Re: acknex debugging questions - 02/15/18 14:29

Depends on the crash. Do you get an engine error or does the engine really crash, with a bluescreen or Windows error?
Posted By: Kartoffel

Re: acknex debugging questions - 02/15/18 14:43

So, we either get script errors in various functions (we suspect that some functions keep using pointers that are no longer valid) or a "normal" Windows crash due to an access violation.

It happens during or right after loading a new level.
Posted By: jcl

Re: acknex debugging questions - 02/15/18 14:57

Ok, then start with -diag and check the log. You should then see at which part of level loading the problem happens.
Posted By: Kartoffel

Re: acknex debugging questions - 02/15/18 15:25

I gave it a try but sadly it's pretty unpredictable.

Most of the time it results in a script crash. When that happens, the engine sometimes exits correctly (the log ends with "Close window at ...") and sometimes crashes right after the error (the last line in the log then only says "Error E1513: Script crash in ...: " and that's it).

When level_load() crashed, I think there was also a "bad file format" error. In that case I guess something writes to memory that it shouldn't, overwriting parts of the file's content while it is being loaded.

We're currently trying to make sure that no functions which use entities or other level content are running before loading a new map. It's a bit tricky though, since it looks like the engine automatically terminates actions that are set on ent_create().
Posted By: jumpman

Re: acknex debugging questions - 02/15/18 15:54

Hello Kartoffel,

I've had "script crash in ...", at seemingly random points as well, and it was in a pathfinding AI. What I did to narrow down the issue, was to program the AI to run around the level forever/automatically so I could get the crash eventually, since it was random/unpredictable.

What ended up being the culprit was me using a VECTOR incorrectly. I'm really sorry, I don't remember the exact problem, but it had to do with variable memory in a parallel function that made the AI shove each other when they got too close, and with this vector being used in a way it shouldn't have been. I think at some point it may have created garbage values that were then used by another function, which eventually crashed the game.

The good news is that script crashes are almost always your own fault rather than the engine's, which means you can fix them! The bad news is that script crashes are your own fault, so you need to sift through your code to figure out what's going on.

Copy your current game's working folder, and in this copy, comment out or remove all code outside of the main and level loading script. Comment out the extra includes, and see if your game still crashes while loading a level. Script an auto-reload loop in your main: reload the current level every 2 seconds, and you will get your crashes eventually. This will help you narrow down what the problem is. Comment out the functions that run on level load, and bring them back one by one. Functions that use entities or level content that are not initialized should give you an empty pointer error, not a script crash.
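
The auto-reload loop could look roughly like the lite-C sketch below. The level name "test.wmb" is a placeholder, and the diag() line assumes the engine is started with -diag so that the log gets written. Treat it as a starting point, not a tested script.

```c
// lite-C sketch, needs the acknex engine to run.
// "test.wmb" is a placeholder level name, not taken from the original project.
#include <acknex.h>

function main()
{
	level_load("test.wmb");
	while(1)
	{
		wait(-2);                 // negative value = wait in seconds
		diag("\nauto-reload: loading level again");
		level_load("test.wmb");   // keep reloading until the crash shows up
	}
}
```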

Use the level_ent entity, and check to see if this entity is NULL, to determine whether a function should be running only after the level is loaded.
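
A minimal lite-C sketch of that check, assuming level_ent is NULL as long as no level is loaded. The function name update_enemies() is made up for illustration.

```c
// lite-C sketch, needs the acknex engine to run.
// update_enemies() is a hypothetical gameplay function.
#include <acknex.h>

function update_enemies()
{
	while(1)
	{
		if(level_ent != NULL)
		{
			// code that touches entities or other level content goes here
		}
		wait(1);   // one frame
	}
}
```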

My hunch is that it's definitely memory related, but caused by something seemingly tiny like a misused variable, which corrupts the memory area so that another function ends up accessing garbage data.
Posted By: Kartoffel

Re: acknex debugging questions - 02/15/18 16:18

thanks for the replies so far

@jumpman we're currently trying to narrow it down. The problem is that the project is somewhat complex and simply commenting out parts of it doesn't really work without making bigger changes to it.
For now I'll focus on trying to make sure that no unwanted functions survive the level_load(). It might be AI or other gameplay related functions that are still running after unloading the level and keep using "freed" memory until it's assigned to something else (which makes it crash).

We're not 100% sure that this is the case, though.
Posted By: Superku

Re: acknex debugging questions - 02/15/18 17:46

A few ideas:

1) Have you thrown in a ton of sys_marker("ABC" + NULL) in your code yet?

2) You are using the "newest" A8 version, right? I think the last time I've seen a bad/ wrong file format error was before the bugfix in Version 8.46, http://manual.3dgamestudio.net/bugs.htm

3) ent_decal may fail if you use that one. Took me forever to find a random crash in a game I once made for my father, it was because of accessing an invalid/ NULL PARTICLE pointer returned by ent_decal if I'm not mistaken.

4) You're probably using something like that already but the following console has helped me a ton the last couple of years:
Code:
// made by Rackscha, adapted by Superku

#ifndef Console_h
	#define Console_h
	#include <windows.h>

	long WINAPI WriteConsole(int Handle, char* Buffer, int CharsToWrite, int* CharsWritten, int reserved);
	long WINAPI CreateConsoleScreenBuffer(long dwDesiredAccess, long dwShareMode, long *lpSecurityAttributes, long dwFlags, long lpScreenBufferData);
	long WINAPI SetConsoleActiveScreenBuffer(long hConsoleOutput);
	long GConsoleBuffer;
	int consoleInitialized = 0;
	int consolePrintTrue = 1;
	int consolePrintTarget = 2; // 0 = console only, 1 = acklog only, 2 = both acklog and console

	void consoleInit()
	{
		if(consoleInitialized) return;
		consoleInitialized = 1;
		AllocConsole();
		GConsoleBuffer = CreateConsoleScreenBuffer(GENERIC_WRITE, FILE_SHARE_READ, 0, CONSOLE_TEXTMODE_BUFFER, 0);
		SetConsoleActiveScreenBuffer(GConsoleBuffer);	
	}

	void cdiag(char* AText)
	{
		if(!consolePrintTrue) return;
		if(consolePrintTarget) diag(AText);
		if(consolePrintTarget == 0 || consolePrintTarget == 2)
		{
			if(!consoleInitialized) consoleInit();
			WriteConsole(GConsoleBuffer, AText, str_len(AText), NULL, 0);
		}
	}
	
	#define cprintf0(str) cdiag(_chr(str))
	#define cprintf1(str,arg1) cdiag(_chr(str_printf(NULL,str,arg1)))
	#define cprintf2(str,arg1,arg2) cdiag(_chr(str_printf(NULL,str,arg1,arg2)))
	#define cprintf3(str,arg1,arg2,arg3) cdiag(_chr(str_printf(NULL,str,arg1,arg2,arg3)))
	#define cprintf4(str,arg1,arg2,arg3,arg4) cdiag(_chr(str_printf(NULL,str,arg1,arg2,arg3,arg4)))
	#define cprintf5(str,arg1,arg2,arg3,arg4,arg5) cdiag(_chr(str_printf(NULL,str,arg1,arg2,arg3,arg4,arg5)))
	#define cprintf6(str,arg1,arg2,arg3,arg4,arg5,arg6) cdiag(_chr(str_printf(NULL,str,arg1,arg2,arg3,arg4,arg5,arg6)))
	#define cprintf7(str,arg1,arg2,arg3,arg4,arg5,arg6,arg7) cdiag(_chr(str_printf(NULL,str,arg1,arg2,arg3,arg4,arg5,arg6,arg7)))
#endif

Use:
cprintf1("\nfoo() START at frame %d...",(int)total_frames);



5) I'm not entirely convinced PROC_GLOBAL is working correctly 100% of the time. I've had rare and seemingly random occurrences of entity functions continuing after level_load which neither set PROC_GLOBAL nor called another function which set that function state. EDIT: I've pretty much removed all PROC_GLOBAL states and restructured my code accordingly.

6) This may or may not be (or have been) related to 5), but I've had issues and errors with enemies falling into oblivion, beyond the level(_ent) borders, in combination with level_load.
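
For idea 1), a rough lite-C sketch of how the markers could bracket a suspicious section. The function name load_next_level() and the marker texts are made up. As far as I understand the manual, the last marker set shows up in the crash message, and passing NULL resets it.

```c
// lite-C sketch, needs the acknex engine to run.
// load_next_level() and the marker names are made up for illustration.
#include <acknex.h>

function load_next_level(STRING* name)
{
	sys_marker("LV1");   // marker set right before the load
	level_load(name);
	sys_marker("LV2");   // marker set right after the load
	wait(1);
	sys_marker(NULL);    // reset the marker once the critical part is over
}
```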
Posted By: Kartoffel

Re: acknex debugging questions - 02/15/18 18:01

thanks Superku!

We do have a console but we ended up not really using it.
I can pretty much pinpoint when the crash happens, though (like I said, when loading a new level). I'm just not sure what causes it to crash. Again, I think it's memory corruption but I'm not sure.

Another thing: so far it looks like these crashes only happen on Windows 10.
If it's memory-related, I still think it's something produced by our code, though. The different operating systems might handle memory allocation/management differently, which would make the crashes we're having more likely on Windows 10.