Random app crashes on publish in AOT mode #21908

Open
rolfbjarne opened this issue Jan 7, 2025 · 7 comments
Labels
incremental-builds need-info Waiting for more information before the bug can be investigated
@rolfbjarne
Member

From @jdarwood007 on Tue, 31 Dec 2024 19:55:39 GMT

Description

Random app crashes on publish in AOT mode

I am randomly getting crashes in my app, only in release mode and only after uploading to the App Store and running from TestFlight on a physical device, which makes it very difficult to get debugging information.

The problem also seems to go away randomly. Sometimes the error disappears after I clean and rebuild the solution a few times, but what fixes it is inconsistent. I have determined, however, that with no code changes in my app, after a few rounds of cleaning and rebuilding I can upload the exact same code and have it not crash.

The first crash report I got had all symbols stripped, so I added this to my .csproj:

		<NoSymbolStrip>true</NoSymbolStrip>

This resulted in me being able to get this stack trace:

0   libsystem_kernel.dylib        	       0x1bc94fbbc __pthread_kill + 8
1   libsystem_pthread.dylib       	       0x1dd4bc844 pthread_kill + 208
2   libsystem_c.dylib             	       0x18c7786ac abort + 124
3   MyApp                         	       0x102869720 sigabrt_signal_handler.cold.1 + 48
4   MyApp                         	       0x1027dc6e4 sigabrt_signal_handler + 196
5   libsystem_platform.dylib      	       0x1dd4a4d48 _sigtramp + 52
6   libsystem_pthread.dylib       	       0x1dd4bc844 pthread_kill + 208
7   libsystem_c.dylib             	       0x18c7786ac abort + 124
8   MyApp                         	       0x102699d88 mono_log_write_os_log + 408
9   MyApp                         	       0x10268f4b4 monoeg_g_logv + 172
10  MyApp                         	       0x10268f5fc monoeg_g_log + 28
11  MyApp                         	       0x1027b48e4 load_aot_module + 4260
12  MyApp                         	       0x1026f04e8 mono_assembly_request_load_from + 1108
13  MyApp                         	       0x1026f0004 mono_assembly_request_open + 472
14  MyApp                         	       0x1026f1ca8 mono_assembly_open + 80
15  MyApp                         	       0x1026326b4 xamarin_assembly_preload_hook + 452
16  MyApp                         	       0x1026f1f54 invoke_assembly_preload_hook + 76
17  MyApp                         	       0x1026ef760 mono_assembly_request_byname + 1012
18  MyApp                         	       0x1027be9d4 load_image + 280
19  MyApp                         	       0x1027b487c load_aot_module + 4156
20  MyApp                         	       0x1027b93a0 mono_aot_get_method + 316
21  MyApp                         	       0x1027a7a60 jit_compile_method_with_opt + 564
22  MyApp                         	       0x1027ac034 mono_jit_runtime_invoke + 480
23  MyApp                         	       0x102753ae8 mono_runtime_invoke_checked + 148
24  MyApp                         	       0x10270e174 create_exception_two_strings + 592
25  MyApp                         	       0x10270def8 mono_exception_from_name_two_strings_checked + 148
26  MyApp                         	       0x1026ebf74 mono_runtime_init_checked + 712
27  MyApp                         	       0x1027ab910 mini_init + 6660
28  MyApp                         	       0x1027b1b60 mono_jit_init_version + 20
29  MyApp                         	       0x102631bd8 xamarin_bridge_initialize + 164
30  MyApp                         	       0x102632ac4 xamarin_main + 648
31  MyApp                         	       0x102825a00 main + 64
32  dyld                          	       0x1070c44d0 start + 444

It seems the app is hitting mono_exception_from_name_two_strings_checked early in startup: the frames beneath it are main, xamarin_main and the Mono runtime initialization.

I hooked up the device to a Mac mini and was able to get the console to present the following error message

error: Failed to load AOT module 'Microsoft.Maui.Controls' while running in aot-only mode because a dependency cannot be found or it is out of date.

This seems to back the results of what I am experiencing in that the crash is occurring randomly and that cleaning fixes it, by possibly removing that out of date dependency.
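
If the device console output is saved to a text file, the failing module names can be pulled out with a small filter. This is only a sketch: `failed_aot_modules` is a hypothetical helper name, and it assumes the log contains the message exactly as quoted above.

```shell
# Hypothetical helper: list the AOT modules that failed to load, given a
# saved device console log (plain text) as the first argument.
failed_aot_modules() {
  # Match the runtime message quoted above and print only the module name.
  sed -n "s/.*Failed to load AOT module '\([^']*\)'.*/\1/p" "$1" | sort -u
}
```

Running it against logs from several crashing builds would show whether the same module fails every time or whether the failing module varies.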

To resolve the crashes, I have tried the following. Each step sometimes works and at other times does not resolve the issue.

  1. Build > Clean Solution
  2. Delete the bin and obj folders
  3. Close Visual Studio, then delete the contents of C:\Users\%me%\AppData\Local\Temp\Xamarin (on Windows) and /Users/%me%/Library/Caches/Xamarin/XMA/SDKs (on Mac)
  4. Shut down both Windows and the Mac, and come back the next day.
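
Step 2 above can be scripted so it is applied the same way every time. The following is a minimal sketch, not an official tool: `clean_build_outputs` is a hypothetical name, and it assumes that every directory named `bin` or `obj` under the solution root is build output that is safe to delete.

```shell
# Remove every bin/ and obj/ directory under the given solution root.
# Assumption: no source directory in the solution is itself named bin or obj.
clean_build_outputs() {
  root="${1:?usage: clean_build_outputs <solution-root>}"
  # -prune stops find from descending into a directory it is about to delete.
  find "$root" -type d \( -name bin -o -name obj \) -prune -exec rm -rf {} +
}
```

Usage would be `clean_build_outputs /path/to/solution`. The cache folders from step 3 are deliberately left out, since deleting them affects other projects as well.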

I don't have a project to provide: the problem is random and, by what feels like luck, the app will just work for a while. I've seen this on a few projects I work with, and the resolution has always been the same: clean and rebuild until it works.

I have noticed that, more often than not, the problem shows up in the following sequence. The app is already installed on the device via TestFlight. I plug the device into the Mac and deploy the app in debug mode, unplug the device, clean, rebuild, and publish the app to TestFlight. When I then install the TestFlight version, it crashes.

Steps to Reproduce

No response

Link to public reproduction project repository

No response

Version with bug

9.0.21 SR2.1

Is this a regression from previous behavior?

Not sure, did not test other versions

Last version that worked well

Unknown/Other

Affected platforms

iOS

Affected platform versions

Any

Did you find any workaround?

No response

Relevant log output

Copied from original issue dotnet/maui#26890

@rolfbjarne
Member Author

From @mattleibow on Sat, 04 Jan 2025 14:18:54 GMT

@rolfbjarne thoughts?

@rolfbjarne
Member Author

From @mattleibow on Sat, 04 Jan 2025 14:22:56 GMT

@jdarwood007 not sure if it matters, but are you using the interpreter? Have you set the <UseInterpreter>true</UseInterpreter> in your csproj?

@rolfbjarne
Member Author

From @jdarwood007 on Sat, 04 Jan 2025 15:44:23 GMT

@mattleibow I do not have it defined in my .csproj, so no. A quick read of the documentation suggests this may be the cause, though, because I use generics. I consume a REST API returning JSON results, and it uses a standardized return with a property that has a defined type (so my return is something like StandardResponse<Student>).

I did enable <MauiEnableXamlCBindingWithSourceCompilation>true</MauiEnableXamlCBindingWithSourceCompilation> while testing this issue and realized I had unsafe bindings. I even got errors indicating that I needed to use JsonSerializerContext. I never noticed these issues because the calls worked as expected, with no problems, in debug mode when I first wrote them.

However, I can randomly get it to build and run successfully, and have even shipped releases. My thinking is that something on the Visual Studio or macOS side is being retained from a previous build and is causing this. It is possible that a file is being cached improperly, that a file stat call is returning cached results, or that a file is not recognized as needing to be rebuilt.

For the linked issues, the first issue mentions dotnet clean. I came to the same conclusion: at some points, doing that resolves the issue. But the source of the problem remains a mystery. Why doesn't a 'build' do this properly?
The first issue does say that the interpreter is enabled by default on debug builds. I am using an M1 Mac mini, so this is helpful and seems close to my issue.

The second issue was resolved with an update, but the change message indicates a different function than the one I'm seeing crash. In my case, mono_runtime_init_checked is called and then reaches an exception in mono_exception_from_name_two_strings_checked, which indicates that the failure happened during mono_runtime_init_checked.

The third linked issue isn't relevant, as their crash is caused by code they directly wrote, whereas from what I can tell, my issue occurs early on in the startup process, before my code can even execute.

@rolfbjarne
Member Author

> I hooked up the device to a Mac mini and was able to get the console to present the following error message
>
> error: Failed to load AOT module 'Microsoft.Maui.Controls' while running in aot-only mode because a dependency cannot be found or it is out of date.
>
> This seems to back the results of what I am experiencing in that the crash is occurring randomly and that cleaning fixes it, by possibly removing that out of date dependency.

Yes, this certainly looks like a problem with incremental builds, where we don't correctly identify what needs to be rebuilt and what doesn't (which is why a clean build works: then everything will be rebuilt).

The annoying part here is that these incremental build issues can be rather hard to track down, because we need very specific instructions in order to reproduce them: provide a test project (or your actual project if that's easier for you), and then something like:

  1. (start from a clean slate)
  2. Build project in VS by clicking the "build" button
  3. Edit file X.cs at line 123 to say "Hello World"
  4. Build project again in VS by clicking the "build" button
  5. Run on device.
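
When recording steps like these, it can help to capture which output files actually changed between two builds. The sketch below is one possible approach, not something the maintainers asked for: `snapshot_outputs` and `changed_outputs` are hypothetical helpers, and the choice of directory to snapshot (for example the project's obj folder) is an assumption.

```shell
# Print "checksum size path" for every file under a directory, sorted by path,
# so that two snapshots can be compared with diff.
snapshot_outputs() {
  (cd "$1" && find . -type f -exec cksum {} + | sort -k 3)
}

# Print the paths that were added, removed, or changed between two snapshots.
changed_outputs() {
  # diff exits non-zero when the files differ, which is expected here.
  { diff "$1" "$2" || true; } | sed -n 's/^[<>] [0-9]* [0-9]* //p' | sort -u
}
```

A possible workflow: run `snapshot_outputs obj > before.txt` after the first build, edit a file, rebuild, run `snapshot_outputs obj > after.txt`, then attach the output of `changed_outputs before.txt after.txt` to the repro description. A file that should have been rebuilt but does not appear in that list is a candidate for the stale dependency.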

@rolfbjarne rolfbjarne added the need-info Waiting for more information before the bug can be investigated label Jan 7, 2025
@rolfbjarne rolfbjarne added this to the Future milestone Jan 7, 2025
@jdarwood007

The app is an internal app used for the business. I tested on another of our internal apps, currently on .NET 8, and even after upgrading to .NET 9 and doing testing, it worked flawlessly. Yet the first app (in this report) gave me trouble for weeks with random crashes. I even tried to mess with various build settings to determine if it was something during the linker doing something and accidentally stripping out a needed component.

Unfortunately, building in dev and release mode has worked fine for the past week now. I don't have a build to send to TestFlight to verify whether it's still happening. Oddly enough, though, when I built for release it would sometimes work on the device, but the same build would crash after being uploaded to TestFlight and installed back onto the device.

Are there additional debugging steps I can try to do in order to produce more verbose information? Can I get the compiler to indicate what libraries/components it determines will not need to be updated and thus are skipped in the build output log? I don't see any from the output myself.

@microsoft-github-policy-service microsoft-github-policy-service bot added need-attention An issue requires our attention/response and removed need-info Waiting for more information before the bug can be investigated labels Jan 8, 2025
@rolfbjarne
Member Author

> I even tried to mess with various build settings to determine if it was something during the linker doing something and accidentally stripping out a needed component.

Any problems with the linker would be consistent - i.e. cleaning wouldn't fix it.

> Are there additional debugging steps I can try to do in order to produce more verbose information? Can I get the compiler to indicate what libraries/components it determines will not need to be updated and thus are skipped in the build output log? I don't see any from the output myself.

Kind of. If you get an MSBuild binlog (https://github.com/xamarin/xamarin-macios/wiki/Diagnosis#binary-build-logs), it will typically be possible to deduce this by inspecting the binlog. However, the build process is rather complex, so it's not trivial. If you can get such a binlog, we could have a look and see if we find anything useful. Note that figuring out that library X wasn't rebuilt correctly is only half the picture: we also need to know why library X should have been rebuilt, which means we need to know exactly what you did in your project.
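
For command-line builds, a binlog can be requested with MSBuild's `-bl` switch. The wrapper below is a sketch only: `build_with_binlog` is a hypothetical helper, the project path and log name are placeholders, and whether `dotnet build` (rather than a publish from Visual Studio) reproduces the problem is an assumption. By default it only prints the command; setting `RUN=1` executes it.

```shell
# Print (or, with RUN=1, execute) a Release build that writes an MSBuild
# binary log. Project path and log file name are placeholders.
build_with_binlog() {
  project="${1:?usage: build_with_binlog <project> [log-file]}"
  log="${2:-msbuild.binlog}"
  # -bl:<file> asks MSBuild to write a binary log to <file>.
  set -- dotnet build "$project" -c Release "-bl:$log"
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}
```

Capturing one binlog from a build that later crashes and one from a build that works would give two logs that can be compared for the same project state.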

@rolfbjarne rolfbjarne added need-info Waiting for more information before the bug can be investigated no-auto-reply For internal use and removed need-attention An issue requires our attention/response labels Jan 8, 2025
@microsoft-github-policy-service microsoft-github-policy-service bot removed the no-auto-reply For internal use label Jan 8, 2025