Do you feel that AI coding assistants are not working for you? Do you keep getting wrong responses, and have you given up on using them? In this blog, some real-life use cases are shown where AI coding assistants are helpful and will support you during your daily work. Enjoy!

1. Introduction

Nowadays, many AI coding assistants are available. They are demonstrated at conferences, in videos, described in blogs, etc. The demos are often impressive, and it seems that AI is able to generate almost all of the source code for you; you only need to review it. However, when you start using AI coding assistants at work, it may seem that they do not work for you and only cost you more time. The truth lies somewhere in between. AI coding assistants can save you a lot of time for certain tasks, but they also have some limitations. It is important to learn for which tasks they can help you and how to recognize when you hit the limits of AI. Beware that AI is evolving at a fast pace, so the limitations of today may be resolved in the near future.

In the remainder of this blog, some tasks are executed with the help of an AI coding assistant. The responses are evaluated, and different techniques are applied to improve the responses when necessary. This blog is the second in a series; this part focuses on generating code. The first part can be read here.

The tasks are executed with DevoxxGenie, an AI coding assistant plugin for IntelliJ IDEA.

Two setups are used in this blog; the setup in use will be clearly mentioned for each task:

  1. Ollama as inference engine with qwen2.5-coder:7b as the model. This setup runs on CPU only.
  2. LMStudio as inference engine with qwen2.5-coder:7b as the model. This setup runs on GPU only.
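
For reference, the Ollama setup can be started from the command line as follows (a minimal sketch, assuming Ollama is installed; the LMStudio setup is configured through its GUI):

ollama pull qwen2.5-coder:7b
ollama run qwen2.5-coder:7b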

As you can see, locally running models are used. The reason for doing so is that you will hit the limits of a model earlier.

The reason for using two setups is that the tasks were started with setup 1. The assumption was that the only difference between CPU and GPU would be performance. However, after a while, it was discovered that a model running on a GPU also provides better responses.

The sources used in this blog are available on GitHub.

2. Prerequisites

Prerequisites for reading this blog are:

  • Basic coding knowledge;
  • Basic knowledge of AI coding assistants;
  • Basic knowledge of DevoxxGenie; for more information, you can read a previous blog or watch the conference talk given at Devoxx.

3. Task: Generate Javadoc

The goal is to see whether AI can be helpful in generating javadoc.

The setup LMStudio, qwen2.5-coder, GPU is used.

The source file Refactor.java does not contain any javadoc.

3.1 Prompt

Open the file and enter the following prompt.

write javadoc for public classes and methods in order that it clearly explains its functionality

3.2 Response

The response can be viewed here.

3.3 Response Analysis

The response is quite good and useful:

  • The generated javadoc is correct;
  • The javadoc for the methods could be a bit more elaborate;
  • A constructor is added, although this was not asked for;
  • The prompt asked to write javadoc for public classes and methods only, but this instruction is ignored.
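
To give an impression of the style of output (a hypothetical illustration, not the verbatim response), the generated javadoc looks roughly like this:

/**
 * Processes the given refactor message and converts its raw fields
 * into domain objects.
 *
 * @param refactorMessage the message to process
 */
public void processMessage(RefactorMessage refactorMessage) {
    // ...
}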

4. Task: Generate Names

The goal is to see whether AI can be helpful with generating names for classes, methods, variables, etc.

There are only two hard things in Computer Science: cache invalidation and naming things. – Phil Karlton

The setup LMStudio, qwen2.5-coder, GPU is used.

The source file Refactor.java contains some names which can be improved.

4.1 Prompt

Open the file and enter the following prompt.

give 3 suggestions for a better method name for the parseData methods

4.2 Response

The response can be viewed here.

4.3 Response Analysis

The suggestions are quite good and an improvement over the current names.

4.4 Prompt

Let’s find out whether a better name for the class can be found.

Open the file and enter the following prompt.

give 3 suggestions for a better class name for the Refactor.java class

4.5 Response

The response can be viewed here.

4.6 Response Analysis

The suggestions are quite good, but some example usage is also given, which wasn’t asked for.

4.7 Prompt

Enter the same prompt, but change the temperature to 0.7 in the DevoxxGenie LLM settings. A higher temperature instructs the LLM to be more creative: it adds more randomness when the model samples its next token.

Open the file and enter the prompt.

give 3 suggestions for a better class name for the Refactor.java class

4.8 Response

The response can be viewed here.

4.9 Response Analysis

For some reason, the response is very short now. But the suggestions seem to be better than with a temperature of 0.

5. Task: Generate Docker Compose File

The goal is to see whether AI can be helpful in converting a docker run command into a Docker Compose file.

The setup Ollama, qwen2.5-coder, CPU is used.

Open WebUI is an offline AI interface; more information can be read in a previous blog. Open WebUI can be started by means of the following Docker command.

docker run -d -p 3000:8080 -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main

5.1 Prompt

Open the file and enter the prompt.

create a docker compose file for this command

5.2 Response

The response can be viewed here.

5.3 Response Analysis

The response is correct. Besides that, an explanation is given about the different components in the Docker Compose file and how it can be executed.
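
For reference, a Docker Compose file equivalent to the docker run command above would look roughly like this (a sketch, not the verbatim response):

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    volumes:
      - open-webui:/app/backend/data

volumes:
  open-webui:

It can then be started with docker compose up -d, which corresponds to the -d flag of the original command.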

6. Task: Generate Cron Expression

The goal is to see whether AI can be helpful in generating Spring Scheduling cron expressions. Note that Spring Scheduling cron expressions differ from regular crontab expressions: they contain an extra field for defining the seconds.
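
As a minimal sketch (the class and method names are made up for illustration, and scheduling is assumed to be enabled with @EnableScheduling), the six fields are: second, minute, hour, day of month, month and day of week.

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class ExampleJob {

    // Fields: second minute hour day-of-month month day-of-week.
    // This expression fires every day at midnight (00:00:00).
    @Scheduled(cron = "0 0 0 * * *")
    public void run() {
        // ...
    }
}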

The setup LMStudio, qwen2.5-coder, GPU is used.

6.1 Prompt

Enter the prompt.

generate a spring scheduling cron expression which runs every 3 days at 0 AM, but not on Sundays

6.2 Response

The response can be viewed here.

6.3 Response Analysis

The LLM does know how to format a valid Spring Scheduling cron expression. However, the cron expression will always run on Monday, Wednesday, Friday and Saturday. That is not what we meant: when it runs on Friday, it should not run again on Saturday.

6.4 Prompt

Write a follow-up prompt.

It will run always on Friday and Saturday, this is not every 3 days. The task should only run every 3 days, excluding (thus skipping) Sundays.

6.5 Response

The response can be viewed here.

6.6 Response Analysis

No real improvement: the expression has changed in such a way that it now runs every three hours. The explanation the LLM gives does not correspond to this new cron expression.
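
For reference (the actual responses are linked above), in the six-field Spring format the difference between every three hours and every three days comes down to which field carries the step:

0 0 */3 * * *   fires every 3 hours, at the start of the hour
0 0 0 */3 * *   fires at midnight on every 3rd day of the month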

6.7 Prompt

This challenge is quite tricky: you want to run the task every three days, but not on Sundays. But what should happen when the next run would fall on a Sunday? Do you want to just skip Sunday, or do you want it to run the next day (meaning Monday)?

Let’s try another prompt indicating that Sundays should be skipped. Create a new chat window in order to start anew.

generate a spring scheduling cron expression which runs every 3 days at 0 AM, Sundays should be skipped.

6.8 Response

The response can be viewed here.

6.9 Response Analysis

The response is correct this time. It also shows the problem the LLM was struggling with: in this case, it depends on which day you want to start. Once that is made explicit, the LLM is able to create a cron expression which satisfies the requirements.

The conclusion is that the words and phrasing you use in a prompt do matter. Sometimes it is better to start anew with a fresh prompt instead of iterating on previous responses.

7. Task: Refactor Code

The goal is to see whether AI can be helpful with refactoring code.

The setup LMStudio, qwen2.5-coder, GPU is used.

Take a look at the processMessage method in the Refactor.java class. This code screams for refactoring. Let’s see how AI can help us with that.

7.1 Prompt

Add the entire code directory to the Prompt Context. Enter the following prompt.

/review

DevoxxGenie will expand this to the following prompt.

Review the selected code, can it be improved or are there any bugs?

7.2 Response

The response can be viewed here.

7.3 Response Analysis

Some general improvements are suggested, but overall the code did not improve a lot.

7.4 Prompt

Open a new chat window and add the entire code directory to the Prompt Context again. This time, provide clearer instructions about what you expect. The following prompt is based on AI-Assisted Software Development.

Please review the following code for quality and potential issues: Refactor.processMessage(RefactorMessage refactorMessage) 
In your review, please consider: 
1. Code style and adherence to best practices 
2. Potential bugs or edge cases not handled 
3. Performance optimizations 
4. Security vulnerabilities 
5. Readability and maintainability 

For each issue found, please: 
1. Explain the problem 
2. Suggest a fix 
3. Provide a brief rationale for the suggested change 

Additionally, are there any overall improvements or refactoring suggestions you would make for this code?

7.5 Response

The response can be viewed here.

7.6 Response Analysis

This response seems to be better:

  1. The LLM sees that the method is too long and too complex; this is a good conclusion.
  2. The input variable refactorMessage should be checked for null; this is correct.
  3. The LLM suggests using a DateTimeFormatter to parse the date, but in this specific situation this is not applicable: the occurrenceTime is a range separated by a slash, e.g. “2014-03-01T13:00:00Z/2015-05-11T15:30:00Z” (see the sketch after this list).
  4. It suggests validating the input data, which is a good suggestion.
  5. A suggestion is given to use early returns, but the fix given is already present in the code.
  6. Overall improvements and suggestions are also given (e.g. to write unit tests).
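
To illustrate point 3, the following hypothetical sketch (the variable name is made up; the example value comes from the code) shows why a single DateTimeFormatter does not apply: the range has to be split first.

import java.time.Instant;

public class OccurrenceTimeExample {

    public static void main(String[] args) {
        // The occurrenceTime is an ISO-8601 interval: two instants separated by a slash.
        String occurrenceTime = "2014-03-01T13:00:00Z/2015-05-11T15:30:00Z";
        String[] parts = occurrenceTime.split("/");
        Instant start = Instant.parse(parts[0]);
        Instant end = Instant.parse(parts[1]);
        System.out.println(start + " .. " + end);
    }
}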

7.7 Prompt

The most interesting suggestion is to break the code down into smaller, more manageable methods. However, no real fix was suggested. So, let’s enter a follow-up prompt.

Break the method down into smaller more manageable methods

7.8 Response

The response can be viewed here.

7.9 Response Analysis

The suggested code is not entirely correct.

  1. findSingleData and findMultiData are not exactly improvements.
  2. parseOccurrenceTime does not parse the same way as the original code.
  3. parseLocation makes use of a constructor with arguments which do not exist.
  4. parseData also makes use of constructors with arguments which do not exist.

However, the overall simplification and readability improvements do make sense.
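
Schematically, the shape of the suggested refactoring is the classic extract-method pattern. The method names below follow the response, but the signatures and bodies are illustrative, not the verbatim suggestion:

public class RefactoredProcessor {

    public void processMessage(RefactorMessage refactorMessage) {
        if (refactorMessage == null) {
            return;
        }
        // Each concern moves into its own small, testable method.
        parseOccurrenceTime(refactorMessage);
        parseLocation(refactorMessage);
        parseData(refactorMessage);
    }

    private void parseOccurrenceTime(RefactorMessage refactorMessage) { /* ... */ }

    private void parseLocation(RefactorMessage refactorMessage) { /* ... */ }

    private void parseData(RefactorMessage refactorMessage) { /* ... */ }
}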

7.10 Prompt

Let’s use Anthropic Claude 3.5 Sonnet to verify whether a cloud LLM is able to produce better results.
Open a new chat window and add the entire code directory to the Prompt Context. Use the same prompt as before.

Please review the following code for quality and potential issues: Refactor.processMessage(RefactorMessage refactorMessage)
In your review, please consider:
1. Code style and adherence to best practices 
2. Potential bugs or edge cases not handled 
3. Performance optimizations 
4. Security vulnerabilities 
5. Readability and maintainability

For each issue found, please:
1. Explain the problem 
2. Suggest a fix 
3. Provide a brief rationale for the suggested change

Additionally, are there any overall improvements or refactoring suggestions you would make for this code?

7.11 Response

The response can be viewed here.

7.12 Response Analysis

Some good suggestions are given:

  1. The conditional logic is complex; this is correct and should be changed.
  2. The method is too long and should be split into smaller methods; that is correct.
  3. Some problems with null references and input validation are found; that is correct.
  4. Some overall suggestions are given.

Overall, the suggestions are of a similar level to those of a local LLM.

7.13 Overall Conclusion For Task Refactor Code

AI can give good suggestions and improvements for existing code. However, the code cannot be taken as-is into your code base. It is better to apply some of the suggestions yourself and prompt again to see whether the code has improved.

The suggestions from a cloud provider and from a local LLM running on a GPU do not differ very much from each other.

8. Conclusion

From the examples used in this blog, the following conclusions can be drawn:

  1. AI coding assistants are good at generating javadoc, generating names and other small coding tasks.
  2. By default, it is wise to set the temperature to 0 for coding tasks (0 instructs the LLM to be factual). However, for more creative tasks like generating names, it is advised to increase the temperature in order to get better results.
  3. Iterating on a previous prompt provides more context to the LLM and often leads to better results.
  4. Changing words can make a difference (see the Spring Scheduling Cron expression example) when prompting. Sometimes it is better to start anew (without chat history) but with different phrasing in order to get better responses.
  5. Starting with short prompts is often an easy way to interact with an LLM, but do not hesitate to give clear instructions about what you expect in a response. Some good software development prompts can be found in AI-Assisted Software Development.
  6. Always review and understand the responses. Suggested code fixes are often not entirely correct, but they do point you in the right direction.
  7. Nowadays, local LLMs are able to generate responses close to the level of cloud LLMs.

